Kubernetes in the Enterprise
Over a decade in, Kubernetes is the central force in modern application delivery. However, as its adoption has matured, so have its challenges: sprawling toolchains, complex cluster architectures, escalating costs, and the balancing act between developer agility and operational control. Beyond running Kubernetes at scale, organizations must also tackle the cultural and strategic shifts needed to make it work for their teams. As the industry pushes toward more intelligent and integrated operations, platform engineering and internal developer platforms are helping teams address issues like Kubernetes tool sprawl, while AI continues cementing its usefulness for optimizing cluster management, observability, and release pipelines. DZone’s 2025 Kubernetes in the Enterprise Trend Report examines the realities of building and running Kubernetes in production today. Our research and expert-written articles explore how teams are streamlining workflows, modernizing legacy systems, and using Kubernetes as the foundation for the next wave of intelligent, scalable applications. Whether you’re on your first prod cluster or refining a globally distributed platform, this report delivers the data, perspectives, and practical takeaways you need to meet Kubernetes’ demands head-on.
Kubernetes provides remarkable scalability and resilience, but when pods crash, even seasoned engineers struggle to interpret complex, cryptic logs and events. This guide walks you through the spectrum of AI-powered root cause analysis and manual debugging, combining command-line reproducibility with predictive observability approaches.

Introduction

Debugging distributed systems is an exercise in controlled chaos. Kubernetes abstracts away deployment complexity, but those same abstractions can hide where things go wrong. The goal of this article is to provide a methodical, data-driven approach to debugging and then extend that process with AI and ML for proactive prevention. We’ll cover:

- Systematic triage of pod and node issues
- Integrating ephemeral and sidecar debugging
- Using ML models for anomaly detection
- Applying AI-assisted Root Cause Analysis (RCA)
- Designing predictive autoscaling and compliance-safe observability

Step-by-Step Implementation

Step 1: Inspect Pods and Events

Start by collecting structured evidence before introducing automation or AI. Key commands:

Shell
kubectl describe pod <pod-name>
kubectl logs <pod-name> -c <container>
kubectl get events --sort-by=.metadata.creationTimestamp

Interpretation checklist:

- Verify container state transitions (Waiting, Running, and Terminated).
- Identify patterns in event timestamps correlated with restarts, which often signal resource exhaustion.
- Capture ExitCode and Reason fields.
- Collect restart counts:

Shell
kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[*].restartCount}'

AI extension: Feed logs and event summaries into an AI model (like GPT-4 or Claude) to quickly surface root causes: “Summarize likely reasons for this CrashLoopBackOff and list next diagnostic steps.” This step shifts engineers from reactive log hunting to structured RCA.

Step 2: Ephemeral Containers for Live Diagnosis

Ephemeral containers are your “on-the-fly” debugging environment. They let you troubleshoot without modifying the base image, which is essential in production environments. Command:

Shell
kubectl debug -it <pod-name> --image=busybox --target=<container>

Inside the ephemeral shell:

- Check environment variables: env | sort
- Inspect mounts: df -h && mount | grep app
- Test DNS: cat /etc/resolv.conf && nslookup google.com
- Verify networking: curl -I http://<service-name>:<port>

AI tip: Feed ephemeral-session logs to an AI summarizer to auto-document steps for your incident management system, creating reusable knowledge.

Step 3: Attach a Debug Sidecar (For Persistent Debugging)

In environments without ephemeral containers (e.g., OpenShift or older clusters), add a sidecar container. Example YAML:

YAML
containers:
  - name: debug-sidecar
    image: nicolaka/netshoot
    command: ["sleep", "infinity"]

Use cases:

- Network packet capture with tcpdump.
- DNS and latency verification with dig and curl.
- Continuous observability in CI environments.

Enterprise note: In large, enterprise-scale clusters, debugging sidecars are often deployed only in non-production namespaces for compliance.

Step 4: Node-Level Diagnosis

Pods inherit instability from their hosting nodes. Commands:

Shell
kubectl get nodes -o wide
kubectl describe node <node-name>
journalctl -u kubelet --no-pager -n 200
sudo crictl ps
sudo crictl logs <container-id>

Investigate:

- Resource pressure conditions (MemoryPressure, DiskPressure).
- Kernel throttling or CNI daemonset failures.
- Container runtime errors (containerd/CRI-O).
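When SSH access to nodes is restricted, the same node-level checks can be run from a node debug pod. A minimal sketch, assuming a kubectl version that supports the node debug subcommand and a systemd-based node image (the ubuntu image choice is an assumption; any shell-capable image works):

Shell
# Start a debug pod on the node; the node's filesystem is mounted at /host
kubectl debug node/<node-name> -it --image=ubuntu
# Inside the debug shell, switch into the host filesystem and reuse the usual tools
chroot /host
journalctl -u kubelet --no-pager -n 200
crictl ps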
AI layer: ML-based observability (e.g., Dynatrace Davis or Datadog Watchdog) can automatically detect anomalies such as periodic I/O latency spikes and point to the affected pods.

Step 5: Storage and Volume Analysis

Persistent Volume Claims (PVCs) can silently cause pod hangs. Diagnostic workflow:

1. Check mounts:

Shell
kubectl describe pod <pod-name> | grep -i mount

2. Inspect PVC binding:

Shell
kubectl get pvc <pvc-name> -o yaml

3. Validate StorageClass and node access mode (RWO, RWX).
4. Review node dmesg logs for mount failures.

AI insight: Anomaly detection models can isolate repeating I/O timeout errors across nodes, clustering them to detect storage subsystem degradation early.

Step 6: Resource Utilization and Automation

Resource throttling leads to cascading restarts. Monitoring commands:

Shell
kubectl top pods
kubectl top nodes

Optimization:

- Fine-tune CPU and memory requests/limits.
- Use kubectl get hpa to confirm scaling thresholds.
- Implement custom metrics for queue depth or latency.

HPA example:

YAML
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: order-service-hpa
spec:
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

Automation isn’t optional at enterprise scale; it’s resilience by design.

Step 7: AI-Augmented Debugging Pipelines

AI is transforming DevOps from reactive incident response to proactive insight generation. Applications:

- Anomaly detection: Identify outlier metrics in telemetry streams.
- AI log summarization: Extract high-value signals from terabytes of text.
- Predictive scaling: Use regression models to forecast utilization.
- AI-assisted RCA: Rank potential causes with confidence scores.

Example AI call:

Shell
cat logs.txt | openai api chat.completions.create \
  -m gpt-4o-mini \
  -g '{"role":"user","content":"Summarize probable root cause"}'

These techniques reduce mean time to recovery (MTTR) and mean time to detection (MTTD).

Step 8: AI-Powered Root Cause Analysis (RCA)

Traditional RCA requires manual correlation across metrics and logs. AI streamlines this process. Approach:

- Cluster error signatures using unsupervised learning.
- Apply attention models to correlate metrics (CPU, latency, I/O).
- Rank potential causes with Bayesian confidence.
- Auto-generate timeline summaries for postmortems.

Example workflow:

1. Collect telemetry and store it in Elastic AIOps.
2. Run an ML job to detect anomaly clusters.
3. Feed the summary to an LLM to describe the likely failure flow.
4. Export the insight to Jira or ServiceNow.

This hybrid system merges deterministic data with probabilistic reasoning, ideal for financial or mission-critical clusters.

Step 9: Predictive Autoscaling

Reactive scaling waits for metrics to breach thresholds; predictive scaling acts before saturation. Implementation path:

1. Gather historic CPU, memory, and request metrics.
2. Train a regression model to forecast 15-minute utilization windows.
3. Integrate predictions with Kubernetes HPA or KEDA.
4. Validate performance using synthetic benchmarks.

Example (conceptual):

Python
# pseudo-code for predictive HPA
predicted_load = model.predict(metrics.last_30min())
if predicted_load > 0.75:
    scale_replicas(current + 2)

In large, enterprise-scale clusters, predictive autoscaling can reduce latency incidents by 25–30%.
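To make the forecasting step concrete, here is a minimal sketch assuming scikit-learn is available; it uses a synthetic utilization series in place of a real metrics store, so names and thresholds are illustrative rather than a production implementation:

Python
# Minimal predictive-scaling sketch (illustrative only; assumes scikit-learn is installed).
import numpy as np
from sklearn.linear_model import LinearRegression

def forecast_utilization(history, horizon_minutes=15):
    # Fit a simple linear trend over per-minute CPU utilization samples (0.0-1.0)
    # and extrapolate horizon_minutes into the future.
    minutes = np.arange(len(history)).reshape(-1, 1)
    model = LinearRegression().fit(minutes, np.asarray(history))
    return float(model.predict([[len(history) + horizon_minutes]])[0])

# Synthetic stand-in for the last 30 minutes of CPU utilization from your metrics store.
history = [0.42 + 0.01 * i for i in range(30)]
predicted_load = forecast_utilization(history)
if predicted_load > 0.75:  # same threshold as the pseudo-code above
    print("scale out: add 2 replicas")  # in practice, call your HPA/KEDA integration here
else:
    print(f"predicted load {predicted_load:.2f}, no scaling needed")

A real setup would feed Prometheus or Datadog metrics into the model and drive KEDA or the HPA external metrics path instead of printing a decision.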
Step 10: Compliance and Security in AI Debugging

AI-driven pipelines must respect governance boundaries. Guidelines:

- Redact credentials and secrets before log ingestion.
- Use anonymization middleware for PII or transaction IDs.
- Apply least-privilege RBAC for AI analysis components.
- Ensure model storage complies with data residency regulations.

Security isn’t just about access; it’s about maintaining explainability in AI-assisted systems.

Step 11: Common Failure Scenarios

| Category | Symptom | Root Cause | Fix |
| RBAC | Forbidden | Missing role permissions | Add RoleBinding |
| Image | ImagePullBackOff | Wrong registry secret | Update and re-pull |
| DNS | Timeout | Stale CoreDNS cache | Restart CoreDNS |
| Storage | VolumeMount fail | PVC unbound | Rebind PVC |
| Crash | Restart loop | Invalid env vars | Correct configuration |

AI correlation engines now automate this table in real time, linking symptoms to resolution recommendations.

Step 12: Real-World Enterprise Example

Scenario: A financial transaction service repeatedly fails post-deployment. Process:

1. Logs reveal TLS handshake errors.
2. An AI summarizer highlights an expired intermediate certificate.
3. A Jenkins assistant suggests reissuing the secret via cert-manager.
4. The deployment is revalidated successfully.

Result: Incident time reduced from 90 minutes to 8 minutes, a measurable ROI.

Step 13: The Future of Autonomous DevOps

The next wave of DevOps will be autonomous clusters capable of diagnosing and healing themselves. Emerging trends:

- Self-healing deployments using reinforcement learning.
- LLM-based ChatOps interfaces for RCA.
- Real-time anomaly explanation using SHAP and LIME interpretability.
- AI governance models ensuring ethical automation.

Vision: The DevOps pipeline of the future isn’t just automated; it’s intelligent, explainable, and predictive.

Conclusion

Debugging Kubernetes efficiently is no longer about quick fixes; it’s about building feedback systems that learn. The modern debugging workflow: Inspect → Diagnose → Automate → Apply AI RCA → Predict. When humans and AI collaborate, DevOps shifts from firefighting to foresight.
While running a synthetic benchmark that pre-warmed the cache, we noticed an abnormal performance impact on Ampere CPUs. Digging deeper, we found that there were many more page faults happening with Ampere CPUs when compared to x86 CPUs. We isolated the issue to the use of certain atomic instructions like ldadd, which load a value from memory, add a value to it, and store the result back in a single instruction. This triggered two page faults under certain conditions, even though it is logically an all-or-nothing operation that is guaranteed to complete in one step. In this article, we will summarize how to qualify this kind of problem, explain how memory management in Linux works in general, show how an atomic Arm64 instruction can generate multiple page faults, and show how to avoid performance slowdowns related to this behavior.

The Problem

The issue was uncovered by a synthetic benchmark that was testing the time to “warm up” the cache by executing atomic instructions to add 0 to every member of a buffer. This is not an uncommon technique to ensure that the information we care about is loaded off disk and available in memory when we first need it. For example, OpenJDK used an atomic addition instruction from Java 18 to 22 to pre-touch memory by adding 0 to the first entry in each memory page in the “heap” to ensure that it is loaded in RAM.

Using profiling tools, we were able to profile the warm-up phase of the benchmark on both Ampere and x86 CPUs and see where the extra time was spent. We then used perf to see that the number of page faults was much higher on Ampere, when compared to x86, using Transparent Huge Pages and memory pre-touch. We were also able to get information on THP from the operating system. After the start-up phase, performance on Ampere was still impacted, and in /proc/vmstat, we observed that on Ampere systems, the thp_split_pmd counter, which indicates the number of fragmented Huge Pages in memory, was much higher than expected. This indicated that, as part of the warm-up, Transparent Huge Pages were being fragmented, causing a performance issue.

On Ampere and other Arm64 platforms since Armv8.1-A, atomic instructions can use the Large System Extensions (LSE) family of atomic instructions. In Arm architectures before the introduction of LSE, atomic additions worked by using separate instructions to load a value from memory into a register, and to check whether the memory location had changed before storing an updated value (Load-Link/Store-Conditional atomics). With LSE, Arm introduced a set of single-instruction atomic operations, including ldadd, which performs a load, add, and store operation.

You would expect a single CPU instruction to generate a single page fault. However, because of how these atomic instructions are implemented on Arm64, this single instruction first generates a “read” fault, corresponding to loading the old value from memory, and a separate “write” fault, corresponding to the new value being stored. As a result, this causes an excess of page faults. This is bad for several reasons:

- It is a significant performance hit in regular operation, because page faults and the required TLB maintenance that comes with them can significantly affect performance.
- When using Transparent Huge Pages (THP), it will result in huge pages being broken up.
When the Huge Page is first referenced, the kernel provisions a single “huge” zero page, but as the memory is written to, this triggers a “Copy-on-Write” in the kernel for each memory page, resulting in individual memory pages being allocated, rather than the contiguous block of memory that you would expect. This means that pre-touching the memory causes 512 page faults for a 2M HugePage and 4k kernel page size, rather than the one that you expect. In addition, because the HugePage is not contiguous in memory, there is a start-up impact.

Memory Management: The 10,000-Foot Overview

To explain how atomic instructions can cause multiple page faults and why this has a performance impact, let’s take a brief detour into how memory management works in a CPU from a very high level. Operating systems create a virtual memory address space for programs running in user space and manage the mapping of virtual memory addresses to physical memory addresses. In addition, Linux offers a feature called Transparent Huge Pages for userspace programs, which allows those programs to reserve large contiguous blocks of physical memory and treat that memory as a single page, even though in practice these Huge Pages are made up of many contiguous physical memory pages.

Figure 1. Memory management

When a virtual memory address is accessed by an application, the operating system first performs a check to see if this virtual memory address is in a physical memory page that is already available in memory. The kernel does this by checking its page table, a collection of mappings from virtual memory addresses to physical memory addresses. The first step in this process is to check the Translation Lookaside Buffer (TLB), which stores a relatively small number of page table entries to accelerate access to the most used memory locations. If the address requested is not in the TLB, the CPU then checks its page table to see if the requested address is already in memory. If it is not, then a page fault is triggered.

There are two types of page faults: a read fault and a write fault. For reads of uninitialized memory, the kernel can avoid accessing physical memory entirely by using a special page called the zero page, which returns zero to all load instructions. When a memory page is written to, this triggers a write fault. A location in physical memory is allocated, the mapping of that address to the virtual memory address of the application in question is stored in the page table, and this mapping is also stored in the TLB for future reference, invalidating the existing TLB entry for the zero page in the process.

But why can this impact performance on Ampere CPUs in certain situations, and what exactly does that have to do with atomic instructions? To understand that, let’s dig a little deeper into how LSE atomic instructions work.

Atomic Instructions on Arm64 and Page Faults

Atomic instructions are a way to guarantee that a block of code will either execute completely or not at all. These instructions are necessary because multiple programs or threads can run on the same CPU at the same time. In normal operation, the operating system manages how much of the CPU’s time is allocated to each thread and which processes run in what order. What this means is that a process or thread can be put into a “wait” state in the middle of a routine, resulting in a context switch as the CPU saves its state for one process and loads the state of the next process to run.
For example, imagine that we have a program that will run using threads and will be counting something. Now imagine that thread 1 reads the value of the counter, say 20, and adds 1 to it, but before it can write this new value of 21 back to the counter, thread 2 reads the old value 20 from the counter, adds 1 to it, and attempts to write its new value back to the counter. As a result, we have undercounted whatever it is that we were counting. This is the problem that we solve using atomics.

On Arm64, since Armv8.1-A, Large System Extensions (LSE) atomic instructions are used for this purpose. In a single instruction, we can read a value, modify it, and store the result. Using LSE instructions, an atomic increment routine can use a single ldadd instruction to load a value and add to it. This interacts with the memory management subsystem, because this single instruction still triggers two different operations: a load to read the old value, and a store to write the new value. The load operation will cause a “read” page fault, but since the page has not yet been referenced, the kernel will not actually allocate a page in main memory; instead, it will use a special “zero” page to save time. If we now immediately write a value to that memory location, we trigger a “write” page fault, which allocates some physical memory, associates it with a location in virtual memory, and then stores a value. For a single atomic instruction, we have caused two page faults, each of which has a cost.

In cases where we are using 2M Transparent Huge Pages, our initial load operation can trigger a read fault that returns a 2M kernel zero page. However, our store operation, which triggers a write page fault, only allocates a 4K memory page. If we are pre-warming the full 2M Huge Page, this will trigger a new write fault every 4K, for 512 extra write faults. What is worse is that what should have been a contiguous 2M block of memory has become a non-contiguous collection of 4K pages in physical memory.

Solving the Problem

Addressing this issue really comes down to two things. First, for memory warm-up, there is an alternative mechanism that can be used to pre-touch memory. The Linux kernel provides a system call, madvise(), to allow the application developer to indicate intentions related to memory and to give advice to the kernel on how the application will use certain sections of memory. This enables the kernel to proactively use appropriate caching or read-ahead techniques to improve performance. In the case of the issue we discovered while starting the JVM with -XX:+UseTransparentHugePages -XX:+AlwaysPreTouch, which started us on this journey, we updated the behavior of the JVM to call madvise(addr, len, MADV_POPULATE_WRITE) to indicate to the kernel that we intend to write to this area of memory. This avoids the interaction of atomic instructions and memory warm-up altogether. However, the ability to indicate MADV_POPULATE_WRITE to madvise was only added in Linux kernel 5.14, so for older versions of Linux, this is not an option.

Second, we are working with the Linux kernel community to ensure that when we are using THP, a write fault on a huge zero page allocates a huge page in memory. While this issue is not yet completely resolved, the Linux kernel community is working on a patch, and we expect the issue to be fixed soon.
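For illustration, here is a minimal C sketch of that madvise()-based pre-touch, assuming Linux 5.14+ and glibc headers recent enough to define MADV_POPULATE_WRITE; the 64 MB mapping size is an arbitrary example:

C
// Pre-populate an anonymous mapping with madvise(MADV_POPULATE_WRITE) instead of
// atomically adding 0 to every page. Requires Linux 5.14+ for MADV_POPULATE_WRITE.
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>

int main(void) {
    size_t len = 64UL << 20;  // 64 MB, an arbitrary example size
    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED) { perror("mmap"); return 1; }

    // Ask the kernel to pre-fault the whole range for writing in one call; with THP
    // enabled, this avoids the read-fault-then-write-fault pattern described above.
    if (madvise(buf, len, MADV_POPULATE_WRITE) != 0) {
        perror("madvise(MADV_POPULATE_WRITE)");  // older kernels report this as unsupported
    }

    munmap(buf, len);
    return 0;
}

This is the same effect the JDK change achieves for the -XX:+AlwaysPreTouch path on kernels that support it.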
For regular pages, the Linux kernel will continue to trigger two page faults with atomic addition instructions, by virtue of how these instructions are implemented in ARM. We believe that atomic “Read-Modify-Write” instructions should only generate one write fault, which would improve performance for these operations. We are still discussing in the kernel community the right way to accomplish this change. In our tests, making such a change halved the number of page faults during a “memory warm-up” benchmark, and reduced the time for the operation by 60% on a virtual machine running on an Ampere Altra CPU.

References

- “Transparent Huge Page support”: https://docs.kernel.org/admin-guide/mm/transhuge.html
- “v5 Patch: mm: Force write fault for atomic RMW instructions”: https://lore.kernel.org/lkml/[email protected]/
- “Huge Zero page confusion”: https://lore.kernel.org/linux-mm/[email protected]/
- “pretouch_memory by atomic-add-0 fragments huge pages unexpectedly”: https://bugs.openjdk.org/browse/JDK-8272807

Check out the full Ampere article collection here.
If it’s spring, it’s usually conference time in Bucharest, Romania. This year was, as always, full of great speakers and talks. Nevertheless, Stephan Janssen’s talk, where the audience met the Devoxx Genie IntelliJ plugin he has been developing, was by far my favorite. The reason I mention it is that during his presentation, I heard about Anthropic’s Model Context Protocol (MCP) for the first time. Quite late, though, considering it was released last year in November. Anyway, to me, the intent of standardizing how additional context could be brought into AI applications to enrich and enhance their accuracy was basically what’s been missing from the picture. With this aspect in mind, I was motivated enough to start studying MCP and to experiment with how its capabilities can improve AI applications. In this direction, from high-level concepts to its practical use when integrated in AI applications, MCP has really caught my attention. The result: a series of articles.

The first one — Enriching AI with Real-Time Insights via MCP — provided general insights regarding MCP and its architecture. It also exemplified how Claude Desktop can leverage it to gain access to real-time web search only via configuration and a dedicated MCP server plug-in. The second — Turn SQL into Conversation: Natural Language Database Queries With MCP — showed how PostgreSQL MCP Server can access private databases and enable LLMs to inspect them and offer useful pieces of information. Very little to no code was written in these two, and still, the outcome obtained was quite promising. In the third article, How to Build an MCP Server with Java SDK, an identical database to the one in the previous article was used, but this time the MCP server was developed from scratch, using only the Java SDK. Basically, it exposed several tools that allowed the AI assistant to access an external system via MCP over stdio transport and fetch accurate context as needed.

This article, the fourth, is the most complex one in the series in terms of the code written, as it exemplifies an end-to-end use case, and the components are developed from scratch:

- An MCP server that connects to a PostgreSQL database and exposes tools that deliver pieces of information to a peer MCP client
- An AI chat client integrated with OpenAI, which, by enclosing an MCP client, allows enriching the context with data provided by the MCP server

Both are web applications, developed using Spring Boot, Spring AI, and Spring AI MCP. In terms of MCP, the transport layer, the entity responsible for handling the communication between the client and the server, is HTTP and Server-Sent Events (HTTP + SSE), which holds a stateful 1:1 connection between the two. Concerning the actual data that enriches the context, it resides in a simple PostgreSQL database schema. The access is accomplished via the great and lightweight asentinel-orm open-source tool. Being built on top of Spring JDBC and possessing most of the features of a basic ORM, it fits nicely into the client application.

Use Case

Working in the domain of telecom expense management, the experiment in this article uses data that models telecom invoices. Let’s assume a user can access a database that contains a simple schema with data related to these. The goal is to use the AI chat client and ask the LLM to compile several key insights about particular invoices, insights that may be further useful when compiling business decisions.
Considering the PostgreSQL database server is up and running, one may create this simple schema.

SQL
create schema mcptelecom;

Everything is simplified, so it’s easier to follow. There is only one entity — Invoice — while its attributes are descriptive and straightforward. The database initialization can be done with the script below.

SQL
drop table if exists invoices cascade;

create table if not exists invoices (
    id serial primary key,
    number varchar not null unique,
    date date not null,
    vendor varchar not null,
    service varchar not null,
    status varchar not null,
    amount numeric(18, 2) default 0.0 not null
);

Although not much, the following experimental data is more than enough for the exemplification here; nevertheless, one may add more or make modifications, as appropriate.

SQL
insert into invoices (number, date, vendor, service, status, amount) values ('vdf-voip-7', '2025-07-01', 'VODAFONE', 'VOIP', 'REVIEWED', 157.50);
insert into invoices (number, date, vendor, service, status, amount) values ('vdf-int-7', '2025-07-01', 'VODAFONE', 'INTERNET', 'PAID', 23.50);
insert into invoices (number, date, vendor, service, status, amount) values ('org-voip-7', '2025-07-01', 'ORANGE', 'VOIP', 'APPROVED', 146.60);
insert into invoices (number, date, vendor, service, status, amount) values ('org-int-7', '2025-07-01', 'ORANGE', 'INTERNET', 'PAID', 30.50);
insert into invoices (number, date, vendor, service, status, amount) values ('vdf-voip-8', '2025-08-01', 'VODAFONE', 'VOIP', 'PAID', 135.50);
insert into invoices (number, date, vendor, service, status, amount) values ('vdf-int-8', '2025-08-01', 'VODAFONE', 'INTERNET', 'APPROVED', 15.50);
insert into invoices (number, date, vendor, service, status, amount) values ('org-voip-8', '2025-08-01', 'ORANGE', 'VOIP', 'REVIEWED', 147.60);
insert into invoices (number, date, vendor, service, status, amount) values ('org-int-8', '2025-08-01', 'ORANGE', 'INTERNET', 'PAID', 14.50);

Briefly, there are invoices from two vendors, from July and August 2025, on two different services — VOIP and Internet — some of them still under review, others approved or paid. There is no doubt that without “connecting” the OpenAI LLM with the private database, the assistant cannot be of much help, as it has no knowledge about the particular invoices. As previously stated, the goal is to put these in relation via MCP, more precisely by developing an MCP server that exposes tools through which the LLM’s knowledge could be enriched. Then, the MCP server is to be used by the MCP client as needed. According to the documentation [Resource 3], “Java SDK for MCP enables standardized integration between AI models and tools.” This is exactly what’s aimed at here.

Developing the MCP Server

The purpose is to develop an MCP server that can read pieces of information about the invoices located in the PostgreSQL database. Once available, the server is checked using the MCP Inspector [Resource 2], a very useful tool for testing or debugging such components. Eventually, the MCP server is used by the MCP client described in the next section via HTTP + SSE. The server project set-up is the following:

- Java 21
- Maven 3.9.9
- Spring Boot 3.5.3
- Spring AI 1.0.0
- PostgreSQL Driver 42.7.7
- Asentinel ORM 1.71.0

The project is named mcp-sb-server and, to be sure the recommended Spring dependencies are used, the spring-ai-bom is configured in the pom.xml file.
XML <dependencyManagement> <dependencies> <dependency> <groupId>org.springframework.ai</groupId> <artifactId>spring-ai-bom</artifactId> <version>${spring-ai.version}</version> <type>pom</type> <scope>import</scope> </dependency> </dependencies> </dependencyManagement> The leading dependency of this project is the Spring AI MCP Server Boot Starter that comes with the well-known and convenient capability of automatic components’ configuration, which easily allows setting up an MCP server in Spring Boot applications. XML <dependency> <groupId>org.springframework.ai</groupId> <artifactId>spring-ai-starter-mcp-server-webmvc</artifactId> </dependency> As the communication is over HTTP, the WebMVC server transport is used. The starter activates McpWebMvcServerAutoConfiguration and provides HTTP-based transport using Spring MVC and automatically configured SSE endpoints. It also brings into the picture an optional STDIO transport (through McpServerAutoConfiguration) which can be enabled or disabled via the spring.ai.mcp.server.stdio property, nevertheless, it will not be used here. In order to be able to read the PostgreSQL database schema, the designated postgresql dependency is added, together with the ORM tool. spring-boot-starter-jdbc is present to ensure the automatic DataSource configuration. XML <dependency> <groupId>org.springframework.boot</groupId> <artifactId>spring-boot-starter-jdbc</artifactId> </dependency> <dependency> <groupId>com.asentinel.common</groupId> <artifactId>asentinel-common</artifactId> <version>1.71.0</version> </dependency> <dependency> <groupId>org.postgresql</groupId> <artifactId>postgresql</artifactId> </dependency> As I already mentioned in a previous article, I see the MCP servers’ implementation split very clearly into two sections. The former is an MCP-specific one that is pretty similar irrespective of the particular tools’ details, while the latter focuses on the actual functionality that is almost unrelated to MCP. To configure the MCP Server, a few properties prefixed by spring.ai.mcp.server are added into the application.properties file. Let’s take them in order. Properties files spring.ai.mcp.server.name=mcp-invoice-server spring.ai.mcp.server.type=sync spring.ai.mcp.server.version=1.0.0 spring.ai.mcp.server.instructions=Instructions - SSE endpoint: /mcp/invoices/sse, SSE message endpoint: /mcp/invoices/messages spring.ai.mcp.server.sse-message-endpoint=/mcp/invoices/messages spring.ai.mcp.server.sse-endpoint=/mcp/invoices/sse spring.ai.mcp.server.capabilities.tool=true spring.ai.mcp.server.capabilities.completion=false spring.ai.mcp.server.capabilities.prompt=false spring.ai.mcp.server.capabilities.resource=false In addition to the server’s name and type, which are obvious, the ones that designate the version and the instructions are pretty important. The version of the instance is sent to clients and used for compatibility checks, while the instructions property provides guidance upon initialization and allows clients to get hints on how to utilize the server. spring.ai.mcp.server.sse-message-endpoint is the endpoint path for Server-Sent Events (SSE) when using web transports, while spring.ai.mcp.server.sse-endpoint is the one the MCP client will use as the communication endpoint. Later in the article, we will see how an HTTP-based session for sending messages is created and how async responses are processed while sending POST JSON requests. The last four properties in the above snippet define the server capabilities. Here, only tools are exposed. 
Next, a ToolCallbackProvider is registered, which communicates to Spring AI the beans that are exposed as MCP services. A MethodToolCallbackProvider implementation is configured, which builds instances from @Tool annotated methods.

Java
@Configuration
public class McpConfig {

    @Bean
    public ToolCallbackProvider toolCallbackProvider(InvoiceTools invoiceTools) {
        return MethodToolCallbackProvider.builder()
                .toolObjects(invoiceTools)
                .build();
    }
}

The tools’ configuration is further implemented in the component below.

Java
@Component
public class InvoiceTools {

    private final InvoiceService invoiceService;

    public InvoiceTools(InvoiceService invoiceService) {
        this.invoiceService = invoiceService;
    }

    @Tool(name = "get-invoices-by-pattern", description = "Filters invoices by the provided pattern")
    public List<Invoice> invoicesBy(@ToolParam(description = "The pattern looked up when filtering") String pattern) {
        return invoiceService.findByPattern(pattern);
    }
}

We specify the name and the description of the tool, together with its parameters, if any. The tool in this MCP server is very simple; it returns invoices by a pattern that is looked up in their number attribute. The result is a list of database entities that are further sent to the client application and used by the LLM to have a better view of the context. With this, the MCP server-specific implementation is completed. Although straightforward, with only a few lines of code needed, conceptually there is quite a lot to cover and configure. Spring in general and Spring AI MCP in particular seem magical (they actually are!), but a thorough understanding of the concepts is needed once the enthusiasm passes, so that the developed applications are robust enough and ready for production.

In order to complete the implementation and have Invoice entities delivered to clients via MCP, an InvoiceService is developed. Data source properties are set in the application.properties file.

Properties files
spring.datasource.url=jdbc:postgresql://localhost:5432/postgres?currentSchema=mcptelecom
spring.datasource.username=${POSTGRES_USER}
spring.datasource.password=${POSTGRES_PASSWORD}

The Invoice entities are mapped over the aforementioned Invoices table and modeled as below.

Java
import com.asentinel.common.orm.mappers.Column;
import com.asentinel.common.orm.mappers.PkColumn;
import com.asentinel.common.orm.mappers.Table;

@Table("Invoices")
public class Invoice {

    public static final String COL_NUMBER = "number";

    @PkColumn("id")
    private int id;

    @Column(value = COL_NUMBER)
    private String number;

    @Column("date")
    private LocalDate date;

    @Column("vendor")
    private Vendor vendor;

    @Column("service")
    private Service service;

    @Column("status")
    private Status status;

    @Column("amount")
    private double amount;

    public enum Vendor { VODAFONE, ORANGE }

    public enum Service { VOIP, INTERNET }

    public enum Status { REVIEWED, APPROVED, PAID }

    ...
}

InvoiceService declares a single method, the one invoked above by the get-invoices-by-pattern tool.

Java
import com.asentinel.common.orm.OrmOperations;

@Service
public class InvoiceService {

    private final OrmOperations orm;

    public InvoiceService(OrmOperations orm) {
        this.orm = orm;
    }

    public List<Invoice> findByPattern(String pattern) {
        return orm.newSqlBuilder(Invoice.class)
                .select()
                .where()
                .column(Invoice.COL_NUMBER).like('%' + pattern + '%')
                .exec();
    }
}

Ultimately, in order to use an OrmOperations instance and inject it into the service, it shall first be configured.
Java @Configuration public class OrmConfig { @Bean public JdbcFlavor jdbcFlavor() { return new PostgresJdbcFlavor(); } @Bean public JdbcOperations jdbcOperations(DataSource dataSource, JdbcFlavor jdbcFlavor) { PgEchoingJdbcTemplate template = new PgEchoingJdbcTemplate(dataSource); template.setJdbcFlavor(jdbcFlavor); return template; } @Bean public SqlQuery sqlQuery(JdbcFlavor jdbcFlavor, JdbcOperations jdbcOps) { return new SqlQueryTemplate(jdbcFlavor, jdbcOps); } @Bean public SqlFactory sqlFactory(JdbcFlavor jdbcFlavor) { return new DefaultSqlFactory(jdbcFlavor); } @Bean public DefaultEntityDescriptorTreeRepository entityDescriptorTreeRepository(SqlBuilderFactory sqlBuilderFactory) { DefaultEntityDescriptorTreeRepository treeRepository = new DefaultEntityDescriptorTreeRepository(); treeRepository.setSqlBuilderFactory(sqlBuilderFactory); return treeRepository; } @Bean public DefaultSqlBuilderFactory sqlBuilderFactory(@Lazy EntityDescriptorTreeRepository entityDescriptorTreeRepository, SqlFactory sqlFactory, SqlQuery sqlQuery) { DefaultSqlBuilderFactory sqlBuilderFactory = new DefaultSqlBuilderFactory(sqlFactory, sqlQuery); sqlBuilderFactory.setEntityDescriptorTreeRepository(entityDescriptorTreeRepository); return sqlBuilderFactory; } @Bean public OrmOperations orm(SqlBuilderFactory sqlBuilderFactory, JdbcFlavor jdbcFlavor, SqlQuery sqlQuery) { return new OrmTemplate(sqlBuilderFactory, new SimpleUpdater(jdbcFlavor, sqlQuery)); } } Although quite verbose at first glance, for a more thorough understanding, one might explore the configuration above in detail by referring to asentinel-orm project [Resource 5]. To check that the data is successfully retrieved from the database in accordance with the particular use case, the following simple test is run. Java @SpringBootTest class InvoiceServiceTest { @Autowired private InvoiceService invoiceService; @Test void findByPattern() { var pattern = "voip"; List<Invoice> invoices = invoiceService.findByPattern(pattern); Assertions.assertTrue(invoices.stream() .allMatch(i -> i.getNumber().contains(pattern))); } } At this point, the mcp-sb-server implementation is finished and ready to be run on port 8081. Testing the MCP Server With MCP Inspector As already stated, MCP Inspector is an excellent tool for testing and debugging MCP servers. Its documentation clearly describes the needed prerequisites to run it and provides details on the available configurations. It can be started with the following command. PowerShell C:\Users\horatiu.dan>npx @modelcontextprotocol/inspector Starting MCP inspector... Proxy server listening on localhost:6277 Session token: 3c672c3389d66786f32ffe2f90d6d2116634bef316a09198fb6e933a5eeefe2b Use this token to authenticate requests or set DANGEROUSLY_OMIT_AUTH=true to disable auth MCP Inspector is up and running at: http://localhost:6274/?MCP_PROXY_AUTH_TOKEN=3c672c3389d66786f32ffe2f90d6d2116634bef316a09198fb6e933a5eeefe2b Once the MCP Inspector is up and running, it can be accessed using the above link. Prior to connecting to the developed MCP server, though, there are some prerequisites: Transport Type: SSEURL: http://localhost:8081/mcp/invoices/sse Once successfully connected, one can observe the following in the mcp-sb-server logs, which means the session has been created. 
Plain Text [mcp-sb-server] [nio-8081-exec-2] i.m.s.t.WebMvcSseServerTransportProvider : Creating new SSE connection for session: ffdd13e8-ad1f-4e5d-9c0a-ad001d4081f6 [mcp-sb-server] [nio-8081-exec-2] i.m.s.t.WebMvcSseServerTransportProvider : Session transport ffdd13e8-ad1f-4e5d-9c0a-ad001d4081f6 initialized with SSE builder [mcp-sb-server] [nio-8081-exec-5] i.m.server.McpAsyncServer : Client initialize request - Protocol: 2025-06-18, Capabilities: ClientCapabilities[experimental=null, roots=RootCapabilities[listChanged=true], sampling=Sampling[]], Info: Implementation[name=mcp-inspector, version=0.16.2] [mcp-sb-server] [nio-8081-exec-5] i.m.s.t.WebMvcSseServerTransportProvider : Message sent to session ffdd13e8-ad1f-4e5d-9c0a-ad001d4081f6 Next, the tools can be listed and also tried out. The picture below exemplifies the execution of get-invoices-by-pattern tool, which returns two invoices when the provided pattern is voip-7. Testing the MCP Server Usage via SSE and JSON-RPC Over HTTP At the beginning of the MCP server implementation, the SSE endpoints were set in the application.properties file. Properties files spring.ai.mcp.server.sse-message-endpoint=/mcp/invoices/messages spring.ai.mcp.server.sse-endpoint=/mcp/invoices/sse When sending HTTP POST requests, the server uses Server-Sent Events for session-based communication, and asynchronous responses may be observed in the browser, once the session exists. With the MCP server running, a session is created by invoking the designated endpoint from the browser. Plain Text http://localhost:8081/mcp/invoices/sse In the server logs, the following lines appear: Plain Text [mcp-sb-server] [nio-8081-exec-7] i.m.s.t.WebMvcSseServerTransportProvider : Creating new SSE connection for session: 7bd2d41c-8643-41e6-9aa3-0b2a3e4496b4 [mcp-sb-server] [nio-8081-exec-7] i.m.s.t.WebMvcSseServerTransportProvider : Session transport 7bd2d41c-8643-41e6-9aa3-0b2a3e4496b4 initialized with SSE builder And the response is shown in the browser: Plain Text id:7bd2d41c-8643-41e6-9aa3-0b2a3e4496b4 event:endpoint data:/mcp/invoices/messages?sessionId=7bd2d41c-8643-41e6-9aa3-0b2a3e4496b4 One may observe that data is populated with exactly the endpoint configured above, together with the sessionId request parameter. 
If sending an initialization request: Plain Text POST http://localhost:8081/mcp/invoices/messages?sessionId=7bd2d41c-8643-41e6-9aa3-0b2a3e4496b4 Accept: application/json { "jsonrpc": "2.0", "id": 0, "method": "initialize", "params": { "protocolVersion": "2024-11-05", "clientInfo": { "name": "Exploratory MCP Client", "version": "1.0.0" } } } The server logs display: Plain Text [mcp-sb-server] [io-8081-exec-10] i.m.server.McpAsyncServer : Client initialize request - Protocol: 2024-11-05, Capabilities: null, Info: Implementation[name=Exploratory MCP Client, version=1.0.0] [mcp-sb-server] [io-8081-exec-10] i.m.s.t.WebMvcSseServerTransportProvider : Message sent to session 7bd2d41c-8643-41e6-9aa3-0b2a3e4496b4 and a new message appears in the browser window: Plain Text id:7bd2d41c-8643-41e6-9aa3-0b2a3e4496b4 event:message data:{"jsonrpc":"2.0","id":0,"result":{"protocolVersion":"2024-11-05","capabilities":{"logging":{},"tools":{"listChanged":true},"serverInfo":{"name":"mcp-invoice-server","version":"1.0.0"},"instructions":"Instructions - SSE endpoint: /mcp/invoices/sse, SSE message endpoint: /mcp/invoices/messages"} To initialize the notifications retrieval, execute: Plain Text POST http://localhost:8081/mcp/invoices/messages?sessionId=7bd2d41c-8643-41e6-9aa3-0b2a3e4496b4 Accept: application/json { "jsonrpc": "2.0", "method": "notifications/initialized" } To list the available tools, execute: Plain Text POST http://localhost:8081/mcp/invoices/messages?sessionId=7bd2d41c-8643-41e6-9aa3-0b2a3e4496b4 Accept: application/json { "jsonrpc": "2.0", "id": "1", "method": "tools/list", "params": {} } and observe the response in the browser: Plain Text id:7bd2d41c-8643-41e6-9aa3-0b2a3e4496b4 event:message data:{"jsonrpc":"2.0","id":"2","result":{"tools":[{"name":"get-invoices-by-pattern","description":"Filters invoices by the provided pattern","inputSchema":{"type":"object","properties":{"pattern":{"type":"string","description":"The pattern looked up when filtering"},"required":["pattern"],"additionalProperties":false}]} To invoke the get-invoices-by-pattern tool, execute: Plain Text POST http://localhost:8081/mcp/invoices/messages?sessionId=7bd2d41c-8643-41e6-9aa3-0b2a3e4496b4 Accept: application/json { "jsonrpc": "2.0", "id": "2", "method": "tools/call", "params": { "name": "get-invoices-by-pattern", "arguments": { "pattern": "voip-7" } } } and observe the response in the browser: Plain Text id:7bd2d41c-8643-41e6-9aa3-0b2a3e4496b4 event:message data:{"jsonrpc":"2.0","id":"2","result":{"content":[{"type":"text","text":"[{\"id\":1,\"number\":\"vdf-voip-7\",\"date\":[2025,7,1],\"vendor\":\"VODAFONE\",\"service\":\"VOIP\",\"status\":\"REVIEWED\",\"amount\":157.5},{\"id\":3,\"number\":\"org-voip-7\",\"date\":[2025,7,1],\"vendor\":\"ORANGE\",\"service\":\"VOIP\",\"status\":\"APPROVED\",\"amount\":146.6}]"}],"isError":false} Obviously, the same result as in the case of the MCP Inspector is obtained. The conclusion is that the developed MCP server is tested and works as expected; it can now be actually used. Developing the AI Chat Client It’s a minimal web application that leverages Spring AI and allows users to communicate with OpenAI. The functionality here is straightforward; however, the emphasis is on the LLM response quality within a very specific scenario, as the MCP server is plugged in. Let’s imagine a user is interested in finding out a few key insights about certain telecom invoices (for example, whose invoice numbers contain a pattern) and, moreover, restricted to a particular month of the year. 
Could the LLM compile such pieces of information on its own? Not quite, but that’s the reason the previous MCP server was developed: to make this context available to the LLM so that it can take it from there. Briefly, data flows in the following manner:

1. The user issues a parameterized HTTP request: GET /assistant/invoice-insights?month=July&year=2025&pattern=vdf.
2. The client application uses a prompt to create the previously mentioned input: "Give me some key insights about the invoices that contain ‘{pattern}’ in their number. If available, use year {year} and month {month} when analyzing."
3. The client application sends it to the LLM, whose output is further returned to the client as a response.

The client project set-up is the following:

- Java 21
- Maven 3.9.9
- Spring Boot 3.5.3
- Spring AI 1.0.0

The project is named mcp-sb-client and, as in the case of the server, the spring-ai-bom is configured in the pom.xml file. In addition to the spring-ai-starter-model-openai dependency, the one of interest here is

XML
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-mcp-client</artifactId>
</dependency>

which allows connecting to the MCP server via stdio or SSE transports. As the communication is done over HTTP, the latter is considered. The SSE connection uses the HttpClient transport implementation, and for every connection to an MCP server, a new MCP client instance is created. To configure the MCP client, a few properties prefixed by spring.ai.mcp.client can be added to the application.properties file. Additionally, the following property indicates the base URL of the MCP server the client connects to, needed when constructing the HttpClientSseClientTransport.

Properties files
mcp.invoices.server.base-url = http://localhost:8081

As already stated, this proof of concept uses OpenAI. In order to be able to connect to the AI model, a valid API key of the user on behalf of which the communication is made should be set in the application.properties file [Resource 1]. To set a bit of the LLM’s boundaries and not make it too creative, the temperature parameter is configured as well.

Properties files
spring.ai.openai.api-key = ${OPEN_AI_API_KEY}
spring.ai.openai.chat.options.temperature = 0.3

On my machine, the key is held in the designated environment variable and used when appropriate. The interaction between the user and the LLM is doable via a ChatClient that is injected into the AssistantService below.

Java
@Service
public class AssistantService {

    private final ChatClient client;

    public AssistantService(ChatClient.Builder clientBuilder, McpSyncClient mcpSyncClient) {
        client = clientBuilder
                .defaultSystem("You are a helpful assistant with Telecom knowledge")
                .defaultToolCallbacks(new SyncMcpToolCallbackProvider(mcpSyncClient))
                .build();
    }

    public String invoiceInsights(String month, String year, String pattern) {
        final String text = """
                Give me some key insights about the invoices that contain '{pattern}' in their number?
                If available, use year {year} and month {month} when analyzing.
                """;

        return client.prompt()
                .user(userSpec -> userSpec.text(text)
                        .param("month", month)
                        .param("year", year)
                        .param("pattern", pattern))
                .call()
                .content();
    }
}

The focus is not on how the ChatClient is used. If interested in the details, have a look at this article. The focus here is on the McpSyncClient that is packaged into a SyncMcpToolCallbackProvider instance and used when the ChatClient is built.
According to the JavaDoc, a SyncMcpToolCallbackProvider has the purpose of discovering MCP tools from one or more MCP servers. It is basically the Spring AI server tool provider. Very briefly, it connects to the MCP server via a sync client, it lists and reads the available exposed server tools (one in the case of our MCP server), and creates a SyncMcpToolCallback for each. The SyncMcpToolCallback actually connects the MCP tool to Spring AI’s tool system and allows it to be executed seamlessly inside Spring AI applications. Obviously, tool calls are handled through the MCP client, which is configured as follows. Java @Configuration public class McpConfig { @Bean public McpSyncClient mcpSyncClient(@Value("${mcp.invoices.server.base-url}") String baseUrl) { var transport = HttpClientSseClientTransport.builder(baseUrl) .sseEndpoint("mcp/invoices/sse") .build(); McpSyncClient client = McpClient.sync(transport) .requestTimeout(Duration.ofSeconds(10)) .clientInfo(new McpSchema.Implementation("MCP Invoices Client", "1.0.0")) .build(); client.initialize(); return client; } } As the client application is a web one as well (running on port 8080), the AssistantService is plugged into a @RestController, to easily interact with it. Java @RestController public class AssistantController { private final AssistantService assistantService; public AssistantController(AssistantService assistantService) { this.assistantService = assistantService; } @GetMapping("/invoice-insights") public ResponseEntity<String> invoicesInsights(@RequestParam(defaultValue = "") String month, @RequestParam(defaultValue = "") String year, @RequestParam String pattern) { return ResponseEntity.ok(assistantService.invoiceInsights(month, year, pattern)); } } The MCP client and server are now connected; the outcome may be examined. The Results To use the integration, one may first run the MCP server. When the MCP Client starts as well, the following lines appear in the logs. Plain Text 2025-08-05T16:40:10.891+03:00 DEBUG 56796 --- [mcp-sb-server] [nio-8081-exec-1] i.m.s.t.WebMvcSseServerTransportProvider : Creating new SSE connection for session: c0eee498-ffdf-4f4e-bfbd-d5b683154e9b 2025-08-05T16:40:10.901+03:00 DEBUG 56796 --- [mcp-sb-server] [nio-8081-exec-1] i.m.s.t.WebMvcSseServerTransportProvider : Session transport c0eee498-ffdf-4f4e-bfbd-d5b683154e9b initialized with SSE builder 2025-08-05T16:40:10.994+03:00 INFO 56796 --- [mcp-sb-server] [nio-8081-exec-2] i.m.server.McpAsyncServer : Client initialize request - Protocol: 2024-11-05, Capabilities: ClientCapabilities[experimental=null, roots=null, sampling=null], Info: Implementation[name=MCP Invoices Client, version=1.0.0] 2025-08-05T16:40:11.006+03:00 DEBUG 56796 --- [mcp-sb-server] [nio-8081-exec-2] i.m.s.t.WebMvcSseServerTransportProvider : Message sent to session c0eee498-ffdf-4f4e-bfbd-d5b683154e9b Plain Text 2025-08-05T16:40:11.053+03:00 INFO 11104 --- [mcp-sb-client] [ient-3-Worker-2] i.m.client.McpAsyncClient : Server response with Protocol: 2024-11-05, Capabilities: ServerCapabilities[completions=null, experimental=null, logging=LoggingCapabilities[], prompts=null, resources=null, tools=ToolCapabilities[listChanged=true]], Info: Implementation[name=mcp-invoice-server, version=1.0.0] and Instructions Instructions - SSE endpoint: /mcp/invoices/sse, SSE message endpoint: /mcp/invoices/messages which demonstrates the connection between the two was successfully initialized. 
Let’s observe what happens when the user sends the next request to the client application, meaning they are interested in insights on the invoices from 2025, but only the ones having the vdf pattern contained in their number. To refresh our memory, there are four such invoices in the database.

Plain Text
GET http://localhost:8080/assistant/invoice-insights?year=2025&pattern=vdf

The response obtained is quite interesting:

Plain Text
Here are the key insights about the invoices that contain 'vdf' in their number for the year 2025:

1. **Total Invoices**: There are 4 invoices that match the pattern 'vdf'.

2. **Invoice Details**:
   - **Invoice 1**:
     - **Number**: vdf-voip-7
     - **Date**: July 1, 2025
     - **Vendor**: VODAFONE
     - **Service**: VOIP
     - **Status**: REVIEWED
     - **Amount**: $157.50
   - **Invoice 2**:
     - **Number**: vdf-int-7
     - **Date**: July 1, 2025
     - **Vendor**: VODAFONE
     - **Service**: INTERNET
     - **Status**: PAID
     - **Amount**: $23.50
   - **Invoice 3**:
     - **Number**: vdf-voip-8
     - **Date**: August 1, 2025
     - **Vendor**: VODAFONE
     - **Service**: VOIP
     - **Status**: PAID
     - **Amount**: $135.50
   - **Invoice 4**:
     - **Number**: vdf-int-8
     - **Date**: August 1, 2025
     - **Vendor**: VODAFONE
     - **Service**: INTERNET
     - **Status**: APPROVED
     - **Amount**: $15.50

3. **Total Amount**: The total amount for these invoices is $332.00.

4. **Status Overview**:
   - 2 invoices are marked as PAID.
   - 1 invoice is marked as REVIEWED.
   - 1 invoice is marked as APPROVED.

5. **Service Breakdown**:
   - VOIP Services: 2 invoices totaling $293.00.
   - INTERNET Services: 2 invoices totaling $39.00.

These insights provide a clear overview of the invoices associated with 'vdf', highlighting the amounts, statuses, and service types.

Recalling from the previous section, the get-invoices-by-pattern MCP server tool filters invoices only by pattern, which means the context provided to the OpenAI LLM was ‘enlarged’ with the corresponding invoices. From this point on, it’s the model’s job to add its contribution to the requested analysis. As one may observe, it managed to come up with a decent answer to the user’s enquiry. Nevertheless, without the use of the MCP server, the context would have been almost empty; thus, no conclusions could have been drawn about the invoices in discussion.

Final Considerations

The applications presented in this article illustrate how users can benefit from a cohesive system where individual components integrate seamlessly to deliver meaningful results. With the use of MCP, more exactly by creating a 1:1 stateful connection over HTTP between the MCP server and the MCP client, data from a private database was put into context to help the OpenAI LLM provide the user with business insights related to telecom invoices. Indeed, this is a simplistic use case, focused on a single specific tool intended to illustrate the purpose. Nevertheless, one may use it as a starting point to explore how MCP works, to understand its architecture, and to see how MCP can enhance AI applications by leveraging its functionalities. One last aspect, though, concerns the idea of putting such services into a production environment, a place that programmers really appreciate and value. As these are Spring Web applications, they can be easily secured using Spring Security. In terms of scalability, although HTTP requests towards LLMs and databases use blocking IO, the thread usage can be significantly improved by leveraging Java 21’s virtual threads.
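As a sketch of that last point, assuming Spring Boot 3.2 or later, virtual threads can be switched on for request handling and task execution with a single property; whether it actually helps should still be validated under load:

Properties files
# Run request handling and task execution on Java 21 virtual threads (Spring Boot 3.2+)
spring.threads.virtual.enabled=true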
Regarding observability, all requests to an LLM come at a cost, but fortunately, Spring Boot Actuator can be quickly plugged in to monitor the metrics related to the actual token consumption, so that the resource consumption can be optimized.

Resources

1. Open AI Platform
2. MCP Inspector
3. MCP Java SDK Documentation
4. Spring AI MCP Reference
5. asentinel-orm project
6. MCP Invoice Server code
7. MCP Invoice Client code

The picture was taken at Cochem Castle in Germany.
Hey, DZone Community! We have an exciting year of research ahead for our beloved Trend Reports. And once again, we are asking for your insights and expertise (anonymously if you wish) — readers just like you drive the content we cover in our Trend Reports. Check out the details for our research survey below. Database Systems Research With databases powering nearly every modern application nowadays, how are developers and organizations utilizing, managing, and evolving these systems — across usage, architecture, operations, security, and emerging trends like AI and real-time analytics? Take our short research survey (~10 minutes) to contribute to our upcoming Trend Report. Oh, and did we mention that anyone who takes the survey could be one of the lucky four to win an e-gift card of their choosing? We're diving into key topics such as: The databases and query languages developers rely onExperiences and challenges with cloud migrationPractices and tools for data security and observabilityData processing architectures and the role of real-time analyticsEmerging approaches like vector and AI-assisted databases Join the Database Systems Research Over the coming month, we will compile and analyze data from hundreds of respondents; results and observations will be featured in the "Key Research Findings" of our upcoming Trend Report. Your responses help inform the narrative of our Trend Reports, so we truly cannot do this without you. Stay tuned for each report's launch and see how your insights align with the larger DZone Community. We thank you in advance for your help! —The DZone Content and Community team
When it comes to software development, one of the biggest mistakes is delivering precisely what the client wants. While this may sound cliché, the problem persists even after decades in the industry. A more effective approach is to begin testing with a focus on business needs. Behavior-driven development (BDD) is a software development methodology that emphasizes behavior and domain terminology, also known as ubiquitous language. It uses a shared, natural language to define and test software behaviors from the user's perspective. BDD builds on test-driven development (TDD) by concentrating on scenarios that are relevant to the business. These scenarios are written as plain-language specifications that can be automated into tests, which also serve as living documentation. This approach promotes a common understanding between both technical and non-technical stakeholders, ensures that the software meets user needs, and helps reduce rework and development time. In this article, we will explore this methodology further and discuss how to implement it using Oracle NoSQL and Java. How BDD and DDD Work Together At first glance, behavior-driven development (BDD) and domain-driven design (DDD) may appear to address different problems — one focusing on testing and the other on modeling. Yet, both share the same philosophical foundation: ensuring that software truly reflects the business domain it serves. DDD, introduced by Eric Evans in his seminal 2003 book Domain-Driven Design: Tackling Complexity in the Heart of Software, teaches us to model software around business concepts — entities, value objects, aggregates, and bounded contexts. Its power lies in the use of ubiquitous language, a shared vocabulary that unites developers and domain experts. BDD, coined a few years later by Dan North, emerged as a natural extension of this idea. It brought ubiquitous language into the testing process, turning business rules into executable specifications. Where DDD defines what the system should represent, BDD validates how the system behaves according to that model. When used together, DDD and BDD form a continuous feedback loop: DDD shapes the domain model that captures the business logic.BDD ensures that the system behavior stays consistent with that model over time. In practice, this synergy means you can write feature scenarios — such as “When I reserve a VIP room, the system should mark it as unavailable” — directly tied to aggregates like Room and Reservation. These tests become living documentation for both developers and stakeholders, ensuring your domain remains aligned with real business needs. If you want to explore this alignment in depth, my book Domain-Driven Design with Java expands on these principles. It shows how to apply DDD patterns in modern Java applications using Jakarta EE, Spring, and cloud technologies, offering a practical foundation for uniting architecture and behavior. Together, DDD and BDD close the gap between understanding the business and proving it works — transforming software from a technical artifact into a faithful expression of the domain itself. Show Me the Code In this sample, we’ll generate a simple hotel management application using Enterprise Java and the Oracle NoSQL Database. The first step is to create the project. 
Since we’re working with Java SE, we can generate it using the following Maven command: Shell mvn archetype:generate \ "-DarchetypeGroupId=io.cucumber" \ "-DarchetypeArtifactId=cucumber-archetype" \ "-DarchetypeVersion=7.30.0" \ "-DgroupId=org.soujava.demos.hotel" \ "-DartifactId=behavior-driven-development" \ "-Dpackage=org.soujava.demos" \ "-Dversion=1.0.0-SNAPSHOT" \ "-DinteractiveMode=false" The next step is to include Eclipse JNoSQL with Oracle NoSQL, along with the Jakarta EE component implementations: CDI, JSON, and the Eclipse MicroProfile implementation. You can find the complete pom.xml file. With the initial project ready, we’ll start by creating the tests. Remember, BDD is an extension of TDD that includes the ubiquitous language — the shared vocabulary between the business and the technical team. Gherkin Feature: Manage hotel rooms Scenario: Register a new room Given the hotel management system is operational When I register a room with number 203 Then the room with number 203 should appear in the room list Scenario: Register multiple rooms Given the hotel management system is operational When I register the following rooms: | number | type | status | cleanStatus | | 101 | STANDARD | AVAILABLE | CLEAN | | 102 | SUITE | RESERVED | DIRTY | | 103 | VIP_SUITE | UNDER_MAINTENANCE | CLEAN | Then there should be 3 rooms available in the system Scenario: Change room status Given the hotel management system is operational And a room with number 101 is registered as AVAILABLE When I mark the room 101 as OUT_OF_SERVICE Then the room 101 should be marked as OUT_OF_SERVICE With the Maven project completed, let's move to the next step, which is creating the model and the repository. As mentioned before, we'll focus on room management. Therefore, our next goal is to ensure that the previously defined BDD tests pass. Let's start by implementing the domain model and the repository: Java public enum CleanStatus { CLEAN, DIRTY, INSPECTION_NEEDED } public enum RoomStatus { AVAILABLE, RESERVED, UNDER_MAINTENANCE, OUT_OF_SERVICE } public enum RoomType { STANDARD, DELUXE, SUITE, VIP_SUITE } @Entity public class Room { @Id private String id; @Column private int number; @Column private RoomType type; @Column private RoomStatus status; @Column private CleanStatus cleanStatus; @Column private boolean smokingAllowed; @Column private boolean underMaintenance; } With the model in place, the next step is to create the bridge between Enterprise Java and Oracle NoSQL as a non-relational database. We can do this easily with Jakarta Data, where a single repository interface is enough, so we don't need to worry about the implementation. Java @Repository public interface RoomRepository { @Query("FROM Room") List<Room> findAll(); @Save Room save(Room room); void deleteBy(); Optional<Room> findByNumber(Integer number); } With the project completed, the next step is to prepare the test environment, starting by making a database instance available for testing. Thanks to Testcontainers, we can easily spin up an isolated instance of Oracle NoSQL to run our tests.
Java public enum DatabaseContainer { INSTANCE; private final GenericContainer<?> container = new GenericContainer<> (DockerImageName.parse("ghcr.io/oracle/nosql:latest-ce")) .withExposedPorts(8080); { container.start(); } public DatabaseManager get(String database) { DatabaseManagerFactory factory = managerFactory(); return factory.apply(database); } public DatabaseManagerFactory managerFactory() { var configuration = DatabaseConfiguration.getConfiguration(); Settings settings = Settings.builder() .put(OracleNoSQLConfigurations.HOST, host()) .build(); return configuration.apply(settings); } public String host() { return "http://" + container.getHost() + ":" + container.getFirstMappedPort(); } } After that, we’ll create a producer integrated with the @Alternative CDI annotation. This configuration teaches CDI how to provide the database instance — in this case, the one managed by Testcontainers: Java @ApplicationScoped @Alternative @Priority(Interceptor.Priority.APPLICATION) public class ManagerSupplier implements Supplier<DatabaseManager> { @Produces @Database(DatabaseType.DOCUMENT) @Default public DatabaseManager get() { return DatabaseContainer.INSTANCE.get("hotel"); } } With Cucumber, we can define an ObjectFactory that injects classes into the Cucumber test context. Since we’re using CDI with Weld as the implementation, we’ll create a custom WeldCucumberObjectFactory to integrate both technologies seamlessly. Java public class WeldCucumberObjectFactory implements ObjectFactory { private Weld weld; private WeldContainer container; @Override public void start() { weld = new Weld(); container = weld.initialize(); } @Override public void stop() { if (weld != null) { weld.shutdown(); } } @Override public boolean addClass(Class<?> stepClass) { return true; } @Override public <T> T getInstance(Class<T> type) { return (T) container.select(type).get(); } } One important note: this setup works as an SPI (Service Provider Interface). Therefore, you must create the following file: src/test/resources/META-INF/services/io.cucumber.core.backend.ObjectFactory With the following content: Plain Text org.soujava.demos.hotels.config.WeldCucumberObjectFactory Next, we define a mapper that converts each Cucumber data table row into a Room entity. Java @ApplicationScoped public class RoomDataTableMapper { @DataTableType public Room roomEntry(Map<String, String> entry) { return Room.builder() .number(Integer.parseInt(entry.get("number"))) .type(RoomType.valueOf(entry.get("type"))) .status(RoomStatus.valueOf(entry.get("status"))) .cleanStatus(CleanStatus.valueOf(entry.get("cleanStatus"))) .build(); } } With the whole test infrastructure in place, the next step is to write the step definitions that contain the tests themselves; they rely on a few helpers on the Room entity, shown in the sketch below.
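The step definitions assume a fluent builder, getters for the number and status, and an update helper on Room that the earlier entity listing omits. A minimal hand-written sketch of those members, added inside the Room class, could look like this; it is an illustration only, and the original project may generate them differently (for example, with Lombok). Java
// Helpers assumed by the step definitions, added inside the Room class shown earlier
// (the fields and annotations stay exactly as listed above).
public static RoomBuilder builder() {
    return new RoomBuilder();
}

public int getNumber() {
    return number;
}

public RoomStatus getStatus() {
    return status;
}

// Changes only the status, leaving the rest of the room's state untouched
public void update(RoomStatus newStatus) {
    this.status = newStatus;
}

public static class RoomBuilder {

    private final Room room = new Room();

    public RoomBuilder number(int number) { room.number = number; return this; }
    public RoomBuilder type(RoomType type) { room.type = type; return this; }
    public RoomBuilder status(RoomStatus status) { room.status = status; return this; }
    public RoomBuilder cleanStatus(CleanStatus cleanStatus) { room.cleanStatus = cleanStatus; return this; }

    public Room build() {
        // Assumption: generate an id here, since the earlier listing does not show how ids are assigned
        room.id = java.util.UUID.randomUUID().toString();
        return room;
    }
}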
Java @ApplicationScoped public class HotelRoomSteps { @Inject private RoomRepository repository; @Before public void cleanDatabase() { repository.deleteBy(); } @Given("the hotel management system is operational") public void theHotelManagementSystemIsOperational() { Assertions.assertThat(repository).as("RoomRepository should be initialized").isNotNull(); } @When("I register a room with number {int}") public void iRegisterARoomWithNumber(Integer number) { Room room = Room.builder() .number(number) .type(RoomType.STANDARD) .status(RoomStatus.AVAILABLE) .cleanStatus(CleanStatus.CLEAN) .build(); repository.save(room); } @Then("the room with number {int} should appear in the room list") public void theRoomWithNumberShouldAppearInTheRoomList(Integer number) { List<Room> rooms = repository.findAll(); Assertions.assertThat(rooms) .extracting(Room::getNumber) .contains(number); } @When("I register the following rooms:") public void iRegisterTheFollowingRooms(List<Room> rooms) { rooms.forEach(repository::save); } @Then("there should be {int} rooms available in the system") public void thereShouldBeRoomsAvailableInTheSystem(int expectedCount) { List<Room> rooms = repository.findAll(); Assertions.assertThat(rooms).hasSize(expectedCount); } @Given("a room with number {int} is registered as {word}") public void aRoomWithNumberIsRegisteredAs(Integer number, String statusName) { RoomStatus status = RoomStatus.valueOf(statusName); Room room = Room.builder() .number(number) .type(RoomType.STANDARD) .status(status) .cleanStatus(CleanStatus.CLEAN) .build(); repository.save(room); } @When("I mark the room {int} as {word}") public void iMarkTheRoomAs(Integer number, String newStatusName) { RoomStatus newStatus = RoomStatus.valueOf(newStatusName); Optional<Room> roomOpt = repository.findByNumber(number); Assertions.assertThat(roomOpt) .as("Room %s should exist", number) .isPresent(); Room updatedRoom = roomOpt.orElseThrow(); updatedRoom.update(newStatus); repository.save(updatedRoom); } @Then("the room {int} should be marked as {word}") public void theRoomShouldBeMarkedAs(Integer number, String expectedStatusName) { RoomStatus expectedStatus = RoomStatus.valueOf(expectedStatusName); Optional<Room> roomOpt = repository.findByNumber(number); Assertions.assertThat(roomOpt) .as("Room %s should exist", number) .isPresent() .get() .extracting(Room::getStatus) .isEqualTo(expectedStatus); } } Time to execute the test with: Shell mvn clean test Where you can see the results: Shell INFO: Connecting to Oracle NoSQL database at http://localhost:61325 using ON_PREMISES deployment type ✔ Given the hotel management system is operational # org.soujava.demos.hotels.HotelRoomSteps.theHotelManagementSystemIsOperational() ✔ And a room with number 101 is registered as AVAILABLE # org.soujava.demos.hotels.HotelRoomSteps.aRoomWithNumberIsRegisteredAs(java.lang.Integer,java.lang.String) ✔ When I mark the room 101 as OUT_OF_SERVICE # org.soujava.demos.hotels.HotelRoomSteps.iMarkTheRoomAs(java.lang.Integer,java.lang.String) ✔ Then the room 101 should be marked as OUT_OF_SERVICE # org.soujava.demos.hotels.HotelRoomSteps.theRoomShouldBeMarkedAs(java.lang.Integer,java.lang.String) Oct 21, 2025 6:18:43 PM org.jboss.weld.environment.se.WeldContainer shutdown INFO: WELD-ENV-002001: Weld SE container fc4b3b51-fba8-4ea6-9cef-42bcee97d220 shut down [INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.231 s -- in org.soujava.demos.hotels.RunCucumberTest [INFO] Running org.soujava.demos.hotels.MongoDBTest [INFO] Tests run: 1, 
Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.007 s -- in org.soujava.demos.hotels.MongoDBTest [INFO] [INFO] Results: [INFO] [INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0 [INFO] [INFO] Conclusion By combining domain-driven design (DDD) and behavior-driven development (BDD), developers can move beyond technical correctness and build software that truly mirrors business intent. DDD gives structure to the domain, ensuring that models capture real-world concepts with precision, while BDD ensures those models behave as expected through clear, testable scenarios written in the language of the business itself. In this article, you’ve learned how to connect these two worlds using Oracle NoSQL, Eclipse JNoSQL, and Jakarta EE — from defining your domain to running real behavioral tests powered by Cucumber and CDI. This synergy transforms tests into living documentation, bridging the gap between engineers and stakeholders and ensuring your system remains aligned with business goals as it evolves. You can go deeper into combining DDD with BDD. In the Domain-Driven Design with Java book, you can find a good starting point for understanding why DDD is still important. It expands on the ideas shared here, showing how DDD and BDD together can lead to simpler, more maintainable, and business-focused software. This kind of software delivers actual value beyond merely meeting the requirements.
In a previous article, we explored how to implement full-text search in PostgreSQL using Hibernate 6 and the posjsonhelper library. We built queries with to_tsvector, to_tsquery, and their simpler wrappers for the plainto_tsquery, phraseto_tsquery, and websearch_to_tsquery functions. This time, we’ll extend that foundation and explore how to rank search results based on their relevance using PostgreSQL’s built-in ranking functions like ts_rank and ts_rank_cd. We’ll also demonstrate how to use them programmatically in Hibernate through the posjsonhelper library. Why Ranking Matters A typical full-text search returns all matching records, but not necessarily in a meaningful order. For example, imagine searching “Postgres ranking” in a database of articles. Some records might mention the term once, while others include it repeatedly or in their titles — yet both would appear equally if we rely only on the @@ operator. That’s where ranking functions come in. ts_rank — computes a relevance score based on term frequency and inverse document frequency (TF/IDF).ts_rank_cd — a variant using cover density ranking, favoring documents where search terms appear close together. (For more information about built-in ranking methods, check Postgres documentation.) Ranking lets your application: Prioritize results by relevance.Improve search UX by showing the best matches first.Keep using PostgreSQL’s native features — no need for external search engines. When Full-Text Search Alone Is Enough While modern projects often explore vector similarity search (e.g., using pgvector with embeddings), not every system needs that level of complexity. Full-text search — especially when enhanced with ranking — is usually sufficient when: You want exact or linguistic matches, not semantic ones.Your dataset is moderate in size (hundreds of thousands, not billions of rows).You want explainable ranking logic (based on term frequency and proximity).You need to stay entirely within PostgreSQL — no additional infrastructure. If your users expect “semantic” search, then vector embeddings are worth considering. But for structured text like product descriptions, articles, or messages, Postgres full-text search with ranking is often all you need. PostgreSQL Ranking Functions Overview Both ts_rank and ts_rank_cd take a tsvector (your indexed document) and a tsquery (your search query): SQL SELECT ts_rank(to_tsvector('english', content), to_tsquery('english', 'postgres & ranking')); They return a numeric score representing how relevant the document is to the search query. You can then order results using this score: SQL ORDER BY ts_rank(to_tsvector('english', content), to_tsquery('english', 'postgres & ranking')) DESC; Implementing Ranking in Hibernate Using posjsonhelper The posjsonhelper library adds type-safe, Criteria-API-compatible support for PostgreSQL functions like to_tsvector, to_tsquery, and text operators. Although it doesn’t include wrappers at this moment for ranking functions, you can easily invoke them through Hibernate’s CriteriaBuilder#function method. Let’s see how this works in practice. 
Example 1: Ranking With ts_rank Java public List<Item> findItemsByWebSearchToTSQuerySortedByTsRank(String phrase, boolean ascSort) { CriteriaBuilder cb = entityManager.getCriteriaBuilder(); CriteriaQuery<Item> cq = cb.createQuery(Item.class); Root<Item> root = cq.from(Item.class); // Build weighted tsvector using posjsonhelper functions Expression<String> shortNameVec = cb.function("setweight", String.class, new TSVectorFunction(root.get("shortName"), new RegconfigTypeCastOperatorFunction((NodeBuilder) cb, ENGLISH_CONFIGURATION, hibernateContext), (NodeBuilder) cb), cb.literal("A") ); Expression<String> fullNameVec = cb.function("setweight", String.class, new TSVectorFunction(root.get("fullName"), new RegconfigTypeCastOperatorFunction((NodeBuilder) cb, ENGLISH_CONFIGURATION, hibernateContext), (NodeBuilder) cb), cb.literal("B") ); Expression<String> shortDescriptionVec = cb.function("setweight", String.class, new TSVectorFunction(root.get("shortDescription"), new RegconfigTypeCastOperatorFunction((NodeBuilder) cb, ENGLISH_CONFIGURATION, hibernateContext), (NodeBuilder) cb), cb.literal("C") ); Expression<String> fullDescriptionVec = cb.function("setweight", String.class, new TSVectorFunction(root.get("fullDescription"), new RegconfigTypeCastOperatorFunction((NodeBuilder) cb, ENGLISH_CONFIGURATION, hibernateContext), (NodeBuilder) cb), cb.literal("D") ); // Concatenate tsvectors (|| operator) SqmExpression<String> fullVector = (SqmExpression<String>) cb.concat(cb.concat(shortNameVec, fullNameVec), cb.concat(shortDescriptionVec, fullDescriptionVec)); // Build tsquery Expression<String> queryExpr = new WebsearchToTSQueryFunction((NodeBuilder) cb, ENGLISH_CONFIGURATION, phrase); // WHERE clause using @@ operator TextOperatorFunction matches = new TextOperatorFunction((NodeBuilder) cb, fullVector, new WebsearchToTSQueryFunction((NodeBuilder) cb, new RegconfigTypeCastOperatorFunction((NodeBuilder) cb, ENGLISH_CONFIGURATION, hibernateContext), phrase), hibernateContext); cq.where(matches); // Ranking Expression<Double> rankExpr = cb.function( "ts_rank", Double.class, fullVector, queryExpr ); cq.orderBy(ascSort ? cb.asc(rankExpr) : cb.desc(rankExpr)); return entityManager.createQuery(cq).getResultList(); } The full code example can be found here. This query produces SQL similar to: SQL select i1_0.id, i1_0.full_description, i1_0.full_name, i1_0.short_description, i1_0.short_name from item i1_0 where ( ( setweight(to_tsvector(?::regconfig, i1_0.short_name), 'A')||setweight( to_tsvector(?::regconfig, i1_0.full_name), 'B' ) )||( setweight(to_tsvector(?::regconfig, i1_0.short_description), 'C')||setweight( to_tsvector(?::regconfig, i1_0.full_description), 'D' ) ) ) @@ websearch_to_tsquery(?::regconfig, ?) 
order by ts_rank(((setweight(to_tsvector(?::regconfig, i1_0.short_name), 'A')||setweight(to_tsvector(?::regconfig, i1_0.full_name), 'B'))||(setweight(to_tsvector(?::regconfig, i1_0.short_description), 'C')||setweight(to_tsvector(?::regconfig, i1_0.full_description), 'D'))), websearch_to_tsquery('english', ?)) The same query can be implemented in HQL, as shown below: Java public List<Item> findItemsByWebSearchToTSQuerySortedByTsRankInHQL(String phrase, boolean ascSort) { String statement = "from Item as item where " + "text_operator_function(" + // text_operator_function - start "concat(" + // main concat - start "concat(" + // first concat - start "function('setweight', to_tsvector('%1$s', item.shortName), 'A')" + "," + "function('setweight', to_tsvector('%1$s', item.fullName), 'B')" + ")" + // first concat - end "," + // main concat - separator "concat(" + // second concat - start "function('setweight', to_tsvector('%1$s', item.shortDescription), 'C')" + "," + "function('setweight', to_tsvector('%1$s', item.fullDescription), 'D')" + ")" + // second concat - end ")" + // main concat - end "," + // text_operator_function - separator "websearch_to_tsquery(cast_operator_function('%1$s','regconfig'), :phrase)" + // websearch_to_tsquery operator ")" + // text_operator_function - end " order by " + // order - start "function('ts_rank', " + // ts_rank function - start "concat(" + // main concat - start "concat(" + // first concat - start "function('setweight', to_tsvector('%1$s', item.shortName), 'A')" + "," + "function('setweight', to_tsvector('%1$s', item.fullName), 'B')" + ")" + // first concat - end "," + // main concat - separator "concat(" + // second concat - start "function('setweight', to_tsvector('%1$s', item.shortDescription), 'C')" + "," + "function('setweight', to_tsvector('%1$s', item.fullDescription), 'D')" + ")" + // second concat - end ")" + // main concat - end "," + // ts_rank function - separator "websearch_to_tsquery(cast_operator_function('%1$s','regconfig'), :phrase)" + // websearch_to_tsquery operator ")" + // ts_rank function - end (ascSort ? " asc" : " desc"); TypedQuery<Item> query = entityManager.createQuery(statement.formatted(ENGLISH_CONFIGURATION), Item.class); query.setParameter("phrase", phrase); return query.getResultList(); } The full code example can be found here.
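For orientation, a hypothetical caller of either variant might look like the sketch below; the surrounding class, the ItemSearchDao name, and Item#getShortName() are illustrative assumptions rather than part of posjsonhelper. Java
import java.util.List;

// Hypothetical usage of the ranking queries defined above.
public class ItemSearchExample {

    private final ItemSearchDao itemSearchDao; // stand-in for the class holding the methods shown above

    public ItemSearchExample(ItemSearchDao itemSearchDao) {
        this.itemSearchDao = itemSearchDao;
    }

    public void printBestMatchesFirst(String phrase) {
        // ascSort = false orders by ts_rank descending, so the most relevant items come first
        List<Item> bestFirst = itemSearchDao.findItemsByWebSearchToTSQuerySortedByTsRank(phrase, false);
        bestFirst.forEach(item -> System.out.println(item.getShortName()));
    }
}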
Example 2: Ranking With ts_rank_cd Java public List<Item> findItemsByWebSearchToTSQuerySortedByTsRankCd(String phrase, boolean ascSort) { CriteriaBuilder cb = entityManager.getCriteriaBuilder(); CriteriaQuery<Item> cq = cb.createQuery(Item.class); Root<Item> root = cq.from(Item.class); // Build weighted tsvector using posjsonhelper functions Expression<String> shortNameVec = cb.function("setweight", String.class, new TSVectorFunction(root.get("shortName"), new RegconfigTypeCastOperatorFunction((NodeBuilder) cb, ENGLISH_CONFIGURATION, hibernateContext), (NodeBuilder) cb), cb.literal("A") ); Expression<String> fullNameVec = cb.function("setweight", String.class, new TSVectorFunction(root.get("fullName"), new RegconfigTypeCastOperatorFunction((NodeBuilder) cb, ENGLISH_CONFIGURATION, hibernateContext), (NodeBuilder) cb), cb.literal("B") ); Expression<String> shortDescriptionVec = cb.function("setweight", String.class, new TSVectorFunction(root.get("shortDescription"), new RegconfigTypeCastOperatorFunction((NodeBuilder) cb, ENGLISH_CONFIGURATION, hibernateContext), (NodeBuilder) cb), cb.literal("C") ); Expression<String> fullDescriptionVec = cb.function("setweight", String.class, new TSVectorFunction(root.get("fullDescription"), new RegconfigTypeCastOperatorFunction((NodeBuilder) cb, ENGLISH_CONFIGURATION, hibernateContext), (NodeBuilder) cb), cb.literal("D") ); // Concatenate tsvectors (|| operator) SqmExpression<String> fullVector = (SqmExpression<String>) cb.concat(cb.concat(shortNameVec, fullNameVec), cb.concat(shortDescriptionVec, fullDescriptionVec)); // Build tsquery Expression<String> queryExpr = new WebsearchToTSQueryFunction((NodeBuilder) cb, ENGLISH_CONFIGURATION, phrase); // WHERE clause using @@ operator TextOperatorFunction matches = new TextOperatorFunction((NodeBuilder) cb, fullVector, new WebsearchToTSQueryFunction((NodeBuilder) cb, new RegconfigTypeCastOperatorFunction((NodeBuilder) cb, ENGLISH_CONFIGURATION, hibernateContext), phrase), hibernateContext); cq.where(matches); // Ranking Expression<Double> rankExpr = cb.function( "ts_rank_cd", Double.class, fullVector, queryExpr ); cq.orderBy(ascSort ? cb.asc(rankExpr) : cb.desc(rankExpr)); return entityManager.createQuery(cq).getResultList(); } The code looks almost identical to Example 1; the only difference is the use of the ts_rank_cd function. The full code example can be found here.
Example 3: Custom Weights and Normalization ts_rank and ts_rank_cd support additional arguments — for example, passing a custom weight array to control how much each part of your text contributes to the overall rank, just like in the example below: Java public List<Item> findItemsByWebSearchToTSQuerySortedByTsRankWithCustomWeight(String phrase, boolean ascSort, double[] weights) { CriteriaBuilder cb = entityManager.getCriteriaBuilder(); CriteriaQuery<Item> cq = cb.createQuery(Item.class); Root<Item> root = cq.from(Item.class); // Build weighted tsvector using posjsonhelper functions Expression<String> shortNameVec = cb.function("setweight", String.class, new TSVectorFunction(root.get("shortName"), new RegconfigTypeCastOperatorFunction((NodeBuilder) cb, ENGLISH_CONFIGURATION, hibernateContext), (NodeBuilder) cb), cb.literal("A") ); Expression<String> fullNameVec = cb.function("setweight", String.class, new TSVectorFunction(root.get("fullName"), new RegconfigTypeCastOperatorFunction((NodeBuilder) cb, ENGLISH_CONFIGURATION, hibernateContext), (NodeBuilder) cb), cb.literal("B") ); Expression<String> shortDescriptionVec = cb.function("setweight", String.class, new TSVectorFunction(root.get("shortDescription"), new RegconfigTypeCastOperatorFunction((NodeBuilder) cb, ENGLISH_CONFIGURATION, hibernateContext), (NodeBuilder) cb), cb.literal("C") ); Expression<String> fullDescriptionVec = cb.function("setweight", String.class, new TSVectorFunction(root.get("fullDescription"), new RegconfigTypeCastOperatorFunction((NodeBuilder) cb, ENGLISH_CONFIGURATION, hibernateContext), (NodeBuilder) cb), cb.literal("D") ); // Concatenate tsvectors (|| operator) SqmExpression<String> fullVector = (SqmExpression<String>) cb.concat(cb.concat(shortNameVec, fullNameVec), cb.concat(shortDescriptionVec, fullDescriptionVec)); // Build tsquery Expression<String> queryExpr = new WebsearchToTSQueryFunction((NodeBuilder) cb, ENGLISH_CONFIGURATION, phrase); // WHERE clause using @@ operator TextOperatorFunction matches = new TextOperatorFunction((NodeBuilder) cb, fullVector, new WebsearchToTSQueryFunction((NodeBuilder) cb, new RegconfigTypeCastOperatorFunction((NodeBuilder) cb, ENGLISH_CONFIGURATION, hibernateContext), phrase), hibernateContext); cq.where(matches); // Ranking Expression<Double> rankExpr = cb.function( "ts_rank", Double.class, new ArrayFunction<>((NodeBuilder) cb, Arrays.stream(weights).mapToObj(w -> (SqmExpression<Double>) cb.literal(w)).toList(), hibernateContext) , fullVector , queryExpr ); cq.orderBy(ascSort ? cb.asc(rankExpr) : cb.desc(rankExpr)); return entityManager.createQuery(cq).getResultList(); } Full code example can be found here; the same example, but implemented with HQL, can be found here. Performance Considerations While ranking functions such as ts_rank and ts_rank_cd significantly improve the quality of search results, they also introduce a computational and I/O overhead that’s important to understand, especially when operating on large datasets. When PostgreSQL executes a full-text search with ranking, it typically performs two main operations: Index-based filtering — The @@ operator uses a GIN or GiST index to quickly locate the documents that match the query terms. This part is very efficient and happens almost entirely in memory.Ranking and sorting — Once matching rows are found, PostgreSQL must read the corresponding tsvector values from disk (or cache) and compute the relevance score for each row. 
This step involves I/O operations and CPU processing, since PostgreSQL needs to access the document lexemes and calculate how well they match the query. If the matching dataset is large, this can become a noticeable bottleneck — even when proper indexes are in place. In other words, the index helps PostgreSQL find relevant rows quickly, but ranking requires touching each matched row’s data, which may trigger additional reads from storage. The cost grows roughly with the number of results that pass the @@ filter. For example: Ranking 100 results is typically negligible (a few milliseconds).Ranking 10,000 or more results may involve enough I/O to impact response times, especially on spinning disks or when data doesn’t fit in shared buffers.Sorting ranked results (e.g., ORDER BY ts_rank(...) DESC) adds additional work, since PostgreSQL must maintain an in-memory or temporary sort buffer. To mitigate these effects: Always filter first, and rank only the subset of matched rows.Consider using precomputed tsvector columns and GIN indexes to minimize recomputation.For very high-volume search workloads, caching top results or using a dedicated search engine (like Elasticsearch) might be beneficial. Ranking is a powerful feature, but it’s not free. When applied to a large dataset, even with correct indexing, the I/O required to fetch and score every matched row can impact overall query performance. Designing queries and indexes with this in mind will ensure that your search remains both relevant and responsive. Conclusion By extending our earlier full-text search implementation with ts_rank and ts_rank_cd, we can now sort results by relevance, not just existence. This approach combines the power of PostgreSQL’s ranking functions with the flexibility of Hibernate’s Criteria API, keeping your queries both expressive and type-safe. While more advanced vector search solutions exist, for most business applications, PostgreSQL’s full-text search with ranking remains a simple, explainable, and highly effective solution.
Key Takeaways REST APIs excel in simplicity, caching, and microservices architecture, with widespread adoption and a mature tooling ecosystem. GraphQL provides precise data fetching, reduces over-fetching, and offers superior flexibility for complex data relationships. Performance varies by use case: REST wins for simple CRUD operations and caching scenarios, while GraphQL shines in mobile apps and complex queries. API Gateway integration is crucial for managing both approaches effectively, providing unified security, monitoring, and transformation capabilities. No universal winner: The choice depends on project requirements, team expertise, and specific technical constraints rather than inherent superiority. Understanding REST APIs and GraphQL: The Foundation of Modern API Architecture When evaluating modern API architectures, developers frequently encounter the question: "What is a RESTful API, and how does it compare to GraphQL?" According to recent industry data, over 61% of organizations are now using GraphQL, while REST continues to dominate enterprise environments. Understanding both approaches is essential for making informed architectural decisions. What Is a RESTful API? A RESTful API (Representational State Transfer) is an architectural style that leverages HTTP protocols to create scalable web services. REST and RESTful services follow six key principles: statelessness, client-server architecture, cacheability, layered system, uniform interface, and code on demand (optional). Unlike the older SOAP vs. REST debate, which centered on protocol complexity, RESTful APIs embrace simplicity and web-native patterns. The fundamental concept behind RESTful API architecture involves treating every piece of data as a resource, accessible through standard HTTP methods (GET, POST, PUT, DELETE). This approach has made RESTful implementations the backbone of countless web applications, from simple CRUD operations to complex enterprise systems. What Is GraphQL? GraphQL represents a paradigm shift from traditional REST approaches. Developed by Facebook in 2012 and open-sourced in 2015, GraphQL is a query language and runtime for APIs that enables clients to request exactly the data they need. Unlike REST's resource-based approach, GraphQL operates through a single endpoint that can handle complex data fetching scenarios. The core innovation of GraphQL lies in its declarative data fetching model. When you need a GraphQL query that returns customers along with their recent orders and contact information, a single request can retrieve all related data. This contrasts sharply with REST, where multiple API calls would be necessary. GraphQL mutation capabilities further extend its functionality, allowing clients to modify data using the same expressive query language. This unified approach to both reading and writing data represents a significant departure from REST's verb-based HTTP methods. Historical Context The evolution from SOAP to REST to modern GraphQL reflects changing application needs. REST APIs have revolutionized how computer systems communicate over the internet, providing a secure, scalable interface that follows specific architectural rules. However, as applications became more sophisticated and mobile-first, the limitations of REST's fixed data structures became apparent. GraphQL emerged as a response to these challenges, particularly the over-fetching and under-fetching problems inherent in REST architectures.
While REST remains excellent for many use cases, GraphQL's client-driven approach addresses specific pain points in modern application development. Key Differences: When to Choose GraphQL vs. REST API The choice between GraphQL and REST involves understanding fundamental differences in how each approach handles data fetching, performance optimization, and development workflows. Data Fetching Approaches REST uses multiple endpoints for each resource, requiring separate HTTP calls for different data types. A typical REST implementation might require: HTTP GET /api/users/123 GET /api/users/123/orders This multi-request pattern often leads to over-fetching (receiving unnecessary data) or under-fetching (requiring additional requests). In contrast, GraphQL allows clients to specify exactly what data they need in a single request: GraphQL query { user(id: 123) { name email orders { id total items { name price } } } } Performance Considerations Performance characteristics vary significantly between approaches. RESTful APIs excel in scenarios where caching is crucial, as HTTP caching mechanisms are well-established and widely supported. The stateless nature of REST makes it highly scalable for simple operations. GraphQL shines in bandwidth-constrained environments, particularly mobile applications. By fetching only required data, GraphQL can reduce payload sizes by 30-50% compared to equivalent REST implementations. However, this efficiency comes with increased server-side complexity, as resolvers must efficiently handle arbitrary query combinations. Development Experience REST's simplicity makes it accessible to developers at all skill levels. The HTTP-based approach aligns naturally with web development patterns, and debugging tools are mature and widely available. RESTful API documentation follows established conventions, making integration straightforward. GraphQL offers powerful introspection capabilities and schema-first development, but requires a steeper learning curve. The strongly-typed schema provides excellent developer experience through auto-completion and compile-time validation, but teams must invest time in understanding GraphQL-specific concepts like resolvers, fragments, and query optimization. Scalability Factors REST is well-suited for microservices architectures, where each service exposes functionality through well-defined APIs. The stateless nature of RESTful services makes horizontal scaling straightforward, and load balancing strategies are well-established. GraphQL presents unique scalability challenges in distributed systems. Query complexity can vary dramatically, making resource planning difficult. Advanced GraphQL implementations require sophisticated caching strategies and query analysis to prevent performance degradation. Technical Implementation: REST vs. GraphQL in Practice Understanding the practical implementation details of both approaches helps developers make informed decisions about which technology best fits their specific requirements. REST API Implementation Patterns RESTful API implementation follows well-established patterns centered around HTTP methods and resource-based URLs. A typical REST API for user management might include: HTTP GET /api/users # List all users POST /api/users # Create new user GET /api/users/123 # Get specific user PUT /api/users/123 # Update user DELETE /api/users/123 # Delete user This approach leverages HTTP's built-in semantics, making RESTful APIs intuitive for developers familiar with web protocols.
Status codes provide clear communication about operation results, and stateless communication ensures scalability. Versioning in REST typically involves URL-based strategies (/v1/users, /v2/users) or header-based approaches. While this can lead to API proliferation, it provides clear backward compatibility guarantees. GraphQL Implementation Essentials GraphQL implementation begins with schema definition, establishing the contract between client and server: GraphQL type User { id: ID! name: String! email: String! orders: [Order!]! } type Order { id: ID! total: Float! createdAt: String! } type Query { users: [User!]! user(id: ID!): User } type Mutation { createUser(name: String!, email: String!): User! } GraphQL mutation operations provide a structured approach to data modification, maintaining the same expressive power as queries. Resolvers handle the actual data fetching logic, allowing for flexible backend integration. Security Considerations Both approaches require careful security implementation, but with different focus areas. RESTful APIs benefit from standard HTTP security practices: authentication headers, CORS policies, and input validation at the endpoint level. GraphQL introduces unique security challenges, particularly around query complexity and depth limiting. Malicious clients could potentially craft expensive queries that strain server resources. Implementing query complexity analysis, depth limiting, and timeout mechanisms becomes crucial for GraphQL security. Error Handling and Monitoring REST relies on HTTP status codes for error communication, providing a standardized approach that integrates well with existing monitoring tools. Error responses follow predictable patterns, making debugging straightforward. GraphQL uses a different error model, where HTTP status is typically 200 even for errors, with actual error information embedded in the response payload. This approach requires specialized monitoring tools and error handling strategies but provides more detailed error context. API Gateway Management: Optimizing GraphQL and REST APIs Modern API management requires sophisticated gateway solutions that can handle both REST and GraphQL effectively. API gateways serve as the critical infrastructure layer that enables organizations to manage, secure, and optimize their API ecosystems regardless of the underlying architecture. Managing RESTful APIs With API Gateway RESTful APIs integrate naturally with traditional API gateway patterns. Standard gateway features like route configuration, load balancing, and protocol translation work seamlessly with REST's resource-based approach. Caching strategies are particularly effective with RESTful services, as the predictable URL patterns and HTTP semantics enable sophisticated caching policies. API gateways excel at transforming REST requests and responses, enabling legacy system integration and API evolution without breaking existing clients. Rate limiting and throttling policies can be applied at the resource level, providing granular control over API consumption. GraphQL API Gateway Integration GraphQL presents unique challenges and opportunities for API gateway integration. Modern gateways like API7 provide GraphQL-specific features, including schema stitching, query complexity analysis, and GraphQL-to-REST transformation capabilities. Query complexity analysis becomes crucial for protecting backend services from expensive operations.
API gateways can implement sophisticated policies that evaluate query depth, field count, and estimated execution time before forwarding requests to GraphQL servers. Schema federation support allows organizations to compose multiple GraphQL services into a unified API surface, with the gateway handling query planning and execution across distributed services. Unified API Management Approach Leading API gateway solutions support multi-protocol environments, enabling organizations to manage both RESTful APIs and GraphQL services through a single management plane. This unified approach provides consistent authentication, authorization, monitoring, and analytics across all API types. Developer portal integration becomes particularly valuable in mixed environments, as it can generate documentation and provide testing interfaces for both REST endpoints and GraphQL schemas. This consistency improves developer experience and reduces onboarding complexity. Performance Optimization Techniques API gateways enable sophisticated performance optimization for both API types. Intelligent caching can be applied to GraphQL queries based on query fingerprinting and field-level cache policies. For RESTful APIs, traditional HTTP caching mechanisms provide excellent performance benefits. Request and response transformation capabilities allow gateways to optimize data formats, compress payloads, and aggregate multiple backend calls into a single client response. Global load balancing and failover mechanisms ensure high availability for both GraphQL and REST services. Making the Right Choice: Decision Framework and Future Trends Selecting between GraphQL and REST requires a structured evaluation of technical requirements, team capabilities, and long-term strategic goals. Rather than viewing this as a binary choice, successful organizations often adopt hybrid approaches that leverage the strengths of both paradigms. Decision Criteria Matrix Project requirements should drive the technology choice. RESTful APIs excel in scenarios requiring: Simple CRUD operations with well-defined resourcesHeavy caching requirementsIntegration with existing HTTP-based infrastructureTeam familiarity with web standardsMicroservices architectures with clear service boundaries GraphQL provides advantages when projects involve: Complex data relationships and nested queriesMobile applications with bandwidth constraintsRapidly evolving client requirementsMultiple client types with different data needsReal-time features require subscription support Use Case Scenarios Enterprise applications often benefit from REST's maturity and simplicity. E-commerce platforms, content management systems, and traditional web applications typically align well with RESTful service patterns. The predictable structure and extensive tooling ecosystem make REST an excellent choice for teams building standard business applications. GraphQL shines in scenarios requiring flexible data access patterns. Social media platforms, analytics dashboards, and mobile applications often see significant benefits from GraphQL's precise data fetching capabilities. When you need to execute a GraphQL query to get the number of customers along with their transaction history and preferences, the single-request efficiency becomes invaluable. Future Outlook and Trends The API landscape continues evolving, with both REST and GraphQL finding distinct niches. 
REST maintains strong adoption in enterprise environments, while GraphQL usage grows in frontend-driven applications and mobile development. Emerging trends include hybrid approaches where REST APIs serve as data sources for GraphQL gateways, providing the best of both worlds. API gateway evolution increasingly focuses on protocol translation and unified management capabilities. Industry adoption data shows continued growth for both approaches, suggesting that the future involves coexistence rather than replacement. Organizations are increasingly adopting API-first strategies that can accommodate multiple paradigms based on specific use case requirements. Conclusion and Recommendations The GraphQL vs REST debate oversimplifies what should be a nuanced technical decision. Both approaches offer distinct advantages, and the optimal choice depends on specific project requirements, team expertise, and organizational constraints. RESTful APIs remain the gold standard for simple, cacheable, and well-understood interaction patterns. Their alignment with HTTP semantics, mature tooling ecosystem, and widespread developer familiarity make them an excellent default choice for many applications. GraphQL provides compelling advantages for applications requiring flexible data access, precise resource utilization, and rapid iteration. The investment in learning GraphQL concepts pays dividends in scenarios where its strengths align with project needs. The most successful API strategies often involve thoughtful integration of both approaches, leveraged through sophisticated API gateway solutions that can manage, secure, and optimize diverse API ecosystems. As API management continues evolving, the ability to support multiple paradigms becomes increasingly valuable for maintaining architectural flexibility and meeting diverse client requirements. Rather than asking "which is better," developers should ask "which approach best serves my specific requirements?" The answer will vary based on context, but understanding the strengths and limitations of both GraphQL and REST enables informed decisions that drive successful API implementations.
Data quality failures don't announce themselves. They compound silently — a malformed timestamp here, a negative revenue figure there — until a quarterly board deck shows impossible numbers or an ML model degrades into uselessness. A 2023 Gartner study pegged the cost at $12.9 million annually per organization, but that figure misses the hidden expense: engineering time spent firefighting data incidents instead of building features. The traditional approach treats validation as a post-processing step. You write data to storage, then run Great Expectations or Deequ checks, discover failures, and either fix the pipeline or quarantine bad records. This pattern creates a fundamental gap: the window between data landing and validation completion. In high-throughput lakehouses processing terabytes daily, that window can represent millions of corrupted records propagating downstream before anyone notices. Delta Expectations — a feature of Databricks' Delta Live Tables (DLT) — collapses that window to zero by enforcing validation during the write transaction itself. This isn't just faster validation; it's an architectural shift from reactive data quality to proactive data contracts. Why Write-Time Validation Changes the Game Traditional ETL validation operates as an external auditor. You extract data, transform it, load it into a Delta table, and then check if it's valid: Extract → Transform → Load → Validate → (Quarantine/Repair) This sequence has consequences: Storage pollution: Invalid data physically lands in your lakehouse. Even if you delete it later, it exists in the transaction log and occupies storage until VACUUM runs.Downstream propagation: Between write completion and validation failure, downstream consumers may have already read the bad data. Scheduled jobs don't wait for validation results.Cascading failures: If validation discovers issues hours after ingestion, you're debugging a cold trail. Which upstream system sent bad data? Was it a transient API failure or a schema change? Delta Expectations inverts this model by embedding validation into the write path: Extract → Transform → Validate+Load (atomic) The validation logic executes during Spark's DataFrame evaluation, before the Delta transaction commits. A failed expectation can abort the write entirely, drop invalid records, or log violations while continuing — but critically, the validation decision happens before data persistence. The Architecture: How Expectations Actually Work Delta Expectations aren't modifications to Delta Lake's transaction protocol itself. They're a DLT framework feature that leverages Spark's lazy evaluation and Delta's atomicity guarantees. When you define an expectation like: Python @dlt.expect_or_drop("valid_price", "price > 0") DLT injects a filter operation into the Spark logical plan. During execution: Evaluation phase: Spark computes the expectation predicate (price > 0) for each record as part of the standard DataFrame transformation graphAction phase: Records are partitioned based on validation results: FAIL mode: If any record fails, Spark throws an exception before the Delta write API is invokedDROP mode: Failed records are filtered from the DataFrame before being passed to .write.format("delta")WARN mode: All records proceed to write, but DLT logs metrics about failuresPersistence phase: Only the valid subset (or all records in WARN mode) participates in the Delta transaction This means validation happens at Spark execution time, not Delta commit time. 
The distinction matters: Delta's ACID guarantees ensure the write is atomic, but the expectation logic runs earlier in the execution pipeline. The phrase "atomic validation" refers to the fact that validation and write are part of the same Spark job — not that expectations are integrated into Delta's transaction protocol. Critical implication: Expectations operate on data in motion (DataFrames) rather than data at rest (Delta tables). This is why they can prevent invalid data from ever being written, but also why they can't validate data already in a table without reprocessing it. Implementation Patterns: From Basic to Production-Grade Pattern 1: Layered Validation With Quarantine Production pipelines don't just drop bad data — they preserve it for debugging. The Bronze-Silver-Gold medallion architecture naturally supports this: Python # Bronze: Accept everything, no expectations @dlt.table( comment="Raw ingestion, no quality gates" ) def bronze_orders(): return spark.readStream.format("cloudFiles") \ .option("cloudFiles.format", "json") \ .load("/mnt/landing/orders") # Silver: Strict validation with quarantine @dlt.table(comment="Validated orders meeting business rules") @dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL") @dlt.expect_or_drop("price_positive", "price > 0 AND price < 1000000") @dlt.expect_or_drop("valid_date", "order_date <= current_date()") @dlt.expect("suspicious_quantity", "quantity < 10000") # WARN only def silver_orders(): return dlt.read_stream("bronze_orders") # Quarantine: Capture failed records @dlt.table(comment="Orders that failed Silver validation") def quarantine_orders(): bronze = dlt.read_stream("bronze_orders") return bronze.filter( (col("order_id").isNull()) | (col("price") <= 0) | (col("price") >= 1000000) | (col("order_date") > current_date()) ).withColumn("quarantine_timestamp", current_timestamp()) \ .withColumn( "failure_reason", when(col("order_id").isNull(), "null_order_id") .when(col("price") <= 0, "negative_price") .when(col("price") >= 1000000, "price_too_high") .otherwise("invalid_date") ) Why this works: Bronze preserves source fidelity for audit and replaySilver expectations ensure downstream consumers never see invalid dataQuarantine tables enable post-mortem analysis without data lossDLT's metrics UI shows expectation pass rates in real time Operational note: The quarantine pattern requires duplicating expectation logic. In practice, extract these to a shared function to maintain DRY code. Pattern 2: Streaming Watermarks and Late Data Delta Expectations behave differently in streaming contexts. With late-arriving data, you need to coordinate watermarks and validation: Python @dlt.table @dlt.expect_or_drop("within_watermark", "event_timestamp > current_timestamp() - interval 24 hours") def streaming_events(): return (spark.readStream .format("kafka") .option("subscribe", "events") .load() .withWatermark("event_timestamp", "1 hour") .select("event_timestamp", "user_id", "event_type")) Critical detail: The expectation runs after the watermark is applied. Records older than the watermark are already dropped by Spark Structured Streaming before expectations are evaluated. 
This creates a layered filter: Watermark drops events older than the threshold. Expectations validate remaining records. Delta writes persist only valid, timely data. Pattern 3: Cross-Table Validation and Referential Integrity Expectations can enforce relationships across tables using joins: Python @dlt.table @dlt.expect_or_drop("valid_customer", "customer_id IN (SELECT customer_id FROM LIVE.dim_customers)") def orders_with_referential_integrity(): return dlt.read("bronze_orders") Performance caveat: This triggers a broadcast join at every micro-batch. If dim_customers is large, this becomes expensive. Prefer precomputed lookup tables or alternative validation strategies for large dimensions. Performance Considerations: What Expectations Actually Cost Delta Expectations aren't free. Each expectation adds a predicate evaluation to your Spark execution plan. The overhead depends on: Predicate complexity: Simple column comparisons add negligible cost; regex or UDFs can be expensive. Record volume: Expectations scale linearly with data volume. Execution mode: WARN — logs all failures: metrics overhead but no impact on persisted data. DROP — filters early, which may reduce shuffle. FAIL — aborts transaction immediately on failure. Optimization strategies: Order DROP expectations to filter as early as possible. Combine similar validations (price > 0 AND price < 1000000). Avoid UDFs; use vectorized SQL expressions. Broadcast dimension tables wisely. Integration Beyond DLT: Orchestration and Observability Airflow Integration DLT pipelines expose REST APIs for orchestration. When FAIL expectations trigger, the API returns a non-zero exit code, enabling alerting via tools like PagerDuty. Metrics Export DLT stores expectation metrics in event logs. Query and export these to observability platforms for real-time monitoring and alerting. Unity Catalog and Data Lineage Register DLT tables in Unity Catalog for governance, access control, and lineage tracing. Expectation metadata in cataloged tables builds trust and traceability. When Not to Use Delta Expectations Non-DLT pipelines: Expectations are DLT features only. Complex statistical validation: For distributional anomalies or advanced data profiling, use Deequ or Great Expectations. Historical data: Expectations only run on new writes; reprocessing is required for validation of existing data. Multi-table atomicity: Can't coordinate transactional constraints across multiple tables. Extreme performance constraints: Expectations add compute overhead — test with your SLA. Alternative Approaches: Understanding Trade-offs
Approach | Timing | Best For | Limitations
Delta Expectations | Write-time (DLT) | Standard validations, quarantine, atomicity | DLT only, not open-source Delta
Great Expectations | Post-write | Profiling, docs, contract CI/CD | Reactive; data already written
Delta CHECK Constraints | Commit time | Basic per-column invariants | Limited expression power
Deequ | Post-write | Large-scale stats/anomaly detection | JVM/Scala only; more setup
Custom Spark Filters | Write-time | Fully custom cases | No built-in metrics
Production Readiness Checklist Testing: Inject failure scenarios to verify enforcement. Confirm quarantine and warning behaviors. Test with evolving schemas. Monitoring: Alert on fail-rate spikes. Observe the expectation evaluation time as complexity grows. Governance: Document purposes of all expectations. Version control expectation logic. Operational: Map escalation paths for FAIL triggers. Procedures for backfills and expectation changes.
The Future: From Validation to Contracts Delta Expectations represent a step toward automated, write-time data contracts. The next evolution will be: Auto-generated expectations informed by historic profiles.Contract versioning tied to pipeline and schema releases.Organization-wide contract publishing for data mesh domains.Integration with schema registries for full data lifecycle coverage. In modern data architecture, velocity without reliability is just expensive noise. Delta Expectations transform data quality from a post-mortem exercise into a real-time guarantee — ensuring that the data powering your analytics, ML models, and business decisions meets the standards required before it ever reaches production. That shift from reactive validation to proactive contracts is the cornerstone of trustworthy data systems.
If you are part of a software team, you have probably heard about end-to-end (E2E) testing. Testing teams typically run a round of end-to-end testing to confirm that the application works as a whole. Every software application should undergo end-to-end testing to ensure it functions as specified. This approach builds confidence in the system and helps development teams determine whether the software is ready for production deployment. In this tutorial, I will walk you through what end-to-end testing is, why it is important, and how to implement it effectively in your software project.

What Is End-to-End Testing?

End-to-end testing refers to testing the software from the end user's perspective. It verifies that all software modules function correctly under real-world conditions. The core purpose of end-to-end testing is to replicate the real-world user experience by testing the application's workflow from beginning to end.

Take the Parabank demo banking application as an example, where modules such as registration, login, accounts, transactions, payments, and reports were built in isolation. For end-to-end testing, we should perform a comprehensive test of the end-user journey: registering, verifying the login functionality, exercising the accounts module by creating a new bank account, performing transactions such as transferring money between accounts, and checking the transaction's status report. These tests mimic real user interactions, allowing us to identify issues in the application as it is used from start to finish.

What Is the Goal of End-to-End Testing?

The primary goal of end-to-end testing is to ensure that all software modules function correctly in real-world scenarios. Another key objective is to identify and resolve hidden issues before the software is released to production. Consider, for example, an end-to-end test of a loan application that lets users fill in their details and check their loan eligibility. By performing end-to-end testing, we can ensure that users can complete that journey without any issues. End-to-end testing not only checks functionality and features but also gives us feedback on the overall user experience.

When to Perform End-to-End Testing?

End-to-end testing is usually conducted after functional and system testing are complete. It is best performed before major releases to confirm that the application works from the end user's perspective without errors. It can uncover hidden issues because all the modules are combined and the overall application is tested from beginning to end, just as an end user would use it. It is recommended to integrate end-to-end tests into CI/CD pipelines to validate workflows and receive faster feedback on builds.

Test Strategy

Ideally, end-to-end testing should be performed at the end of the software development life cycle. The majority of tests should be unit tests, followed by integration and service-level tests, with end-to-end tests performed last. Google's Testing Blog suggests a 70/20/10 split: 70% unit tests, 20% integration tests, and 10% end-to-end tests. The specific mix may vary for each team, but it should generally maintain the shape of a pyramid.
In short, unit tests form the base, integration tests come next, and end-to-end tests sit at the top of the structure, forming the shape of a pyramid.

Different Stages of End-to-End Testing

End-to-end testing is performed in three phases:

Planning
Testing
Test closure

Let's learn about these phases in detail, one by one.

Planning

In the planning phase, the following points should be considered:

Understand the business and functional requirements.
Create a test plan based on the requirement analysis.
Create test cases for end-to-end testing scenarios.

A tester should gain knowledge of the application and understand its different test journeys. These journeys should be designed from the end user's point of view, covering the entire process from beginning to end. All the happy paths should be noted down, and test cases should be designed accordingly. For example, for an e-commerce application, a simple test journey would be as shown in the figure below:

Similarly, other test journeys can be prepared, for example, one where the user adds a product to the cart and logs out of the application, then logs in again and continues from where they left off, and so on.

We should also consider the following points in the planning phase to get a head start on the testing:

Set up a production-like environment to simulate real-world scenarios.
Prepare the test data, test strategy, and test cases for testing real-world scenarios.
Define entry and exit criteria to give end-to-end testing a clear objective.
Have the test cases, test data, and entry and exit criteria reviewed by the business analyst or product owner.

Testing

The testing phase can be divided into two stages: prerequisites and test execution.

Prerequisites

In this stage, it should be ensured that:

All feature development is complete.
All submodules and components of the application are integrated and working together as a system.
System testing is complete for all related subsystems in the application.
The staging environment, designed to replicate the production setup, is fully operational. This environment enables us to simulate real-world scenarios and effectively reproduce production-like conditions, allowing seamless execution of end-to-end scenarios.

After completing the prerequisites, we can proceed to the test execution stage.

Test Execution

In this stage, the testing team should:

Execute the test cases.
Report bugs when tests fail.
Retest the bugs once they are fixed.
Rerun all the end-to-end tests to ensure everything works as expected.

The end-to-end tests can be executed manually or through automation in CI/CD pipelines. Executing end-to-end tests through an automated pipeline is the recommended approach, as it saves the testing team time and effort while ensuring high-quality results in the shortest possible time.

Test Closure

In this stage, the following actions should be performed:

Analyze the test results.
Prepare the test report.
Evaluate the exit criteria.
Perform test closure.

The test closure stage in end-to-end testing involves finalizing test activities and documenting results. It ensures that all test deliverables are complete. It also includes assessing test coverage and documenting key takeaways, for example, noting down known issues. Finally, a test closure report is prepared for the stakeholders. This report can be of great help in Go/No-Go meetings.
End-to-End API Testing Example

Let's take the RESTful e-commerce APIs as an example; there are six main APIs in the RESTful e-commerce application:

Create Token (POST /auth)
Add Order (POST /addOrder)
Get Order (GET /getOrder)
Update Order (PUT /updateOrder)
Partial Update Order (PATCH /partialUpdateOrder)
Delete Order (DELETE /deleteOrder)

Before performing end-to-end testing on these APIs, we should first analyze their requirements, usage patterns, and technical specifications. These details are useful for writing the end-to-end test cases as well as designing the automation test strategy. As per the Swagger documentation, the following functional points related to the APIs can be noted:

The POST Add Order API is used to create new orders in the system, while the GET Order API retrieves an order using the provided order ID.
The Create Token API generates a token that is required by the Update and Delete APIs as a security measure, so only registered users can update or delete their orders.
The update and partial update APIs are used for updating orders.
The delete API is used for deleting an order.

Based on these details, the following strategy can be used for end-to-end testing (a minimal pytest sketch of this flow appears after the summary):

Generate a new token by calling the POST /auth API and save it for later use.
Create a new order using the POST /addOrder API.
Retrieve the newly created order by passing the order ID to the GET /getOrder API.
Using the previously generated token, update an existing order using the PUT /updateOrder API.
Verify the partial update functionality by updating an existing order using the PATCH /partialUpdateOrder API.
Delete the existing order using the DELETE /deleteOrder API.
To verify that the order has been deleted, call the GET /getOrder API again. A 404 status code should be returned in the response, confirming that the order has been removed from the system.

As you can see, we used all the major APIs to perform end-to-end testing as a real-world scenario. End-to-end testing can be carried out similarly for a web or mobile application. It's important to evaluate the application from an end user's perspective, create relevant test scenarios, and have them reviewed by the team's business analyst or product owner.

Summary

End-to-end testing is a comprehensive testing approach that validates the entire workflow of an application, from start to finish, to ensure all integrated components function as expected. It simulates real user scenarios to identify issues across different modules, systems, and their dependencies. This helps ensure the application provides a smooth and reliable user experience and uncovers issues early, before end users encounter them. Happy testing!
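To make the strategy above concrete, here is a minimal pytest sketch of the same order lifecycle using the requests library. It is illustrative only: the base URL, credentials, request and response payload shapes, the Authorization header, and the exact status codes asserted are assumptions and should be adjusted to match the actual RESTful e-commerce service and its Swagger specification.

Python

# Illustrative end-to-end API journey; endpoints follow the article, but the
# base URL, credentials, payload shapes, and status codes are assumptions.
import requests

BASE_URL = "http://localhost:3004"  # placeholder; point at your deployment


def test_order_lifecycle_end_to_end():
    # 1. Generate a token for the protected update/delete operations.
    auth = requests.post(f"{BASE_URL}/auth",
                         json={"username": "admin", "password": "secretPass123"})
    assert auth.status_code in (200, 201)
    token = auth.json()["token"]  # response shape assumed

    # 2. Create a new order (payload fields and array wrapper assumed).
    new_order = {"user_id": "1", "product_id": "101",
                 "product_name": "MacBook Pro", "product_amount": 1999,
                 "qty": 1, "tax_amt": 99, "total_amt": 2098}
    created = requests.post(f"{BASE_URL}/addOrder", json=[new_order])
    assert created.status_code in (200, 201)
    order_id = created.json()["orders"][0]["id"]  # response shape assumed

    # 3. Retrieve the newly created order by ID.
    fetched = requests.get(f"{BASE_URL}/getOrder", params={"id": order_id})
    assert fetched.status_code == 200

    # 4. Full update using the previously generated token.
    updated_order = dict(new_order, qty=2, total_amt=4196)
    updated = requests.put(f"{BASE_URL}/updateOrder/{order_id}",
                           json=updated_order,
                           headers={"Authorization": token})
    assert updated.status_code == 200

    # 5. Partial update of a single field.
    patched = requests.patch(f"{BASE_URL}/partialUpdateOrder/{order_id}",
                             json={"product_amount": 1899},
                             headers={"Authorization": token})
    assert patched.status_code == 200

    # 6. Delete the order.
    deleted = requests.delete(f"{BASE_URL}/deleteOrder/{order_id}",
                              headers={"Authorization": token})
    assert deleted.status_code in (200, 204)

    # 7. Verify deletion: fetching the same order should now return 404.
    gone = requests.get(f"{BASE_URL}/getOrder", params={"id": order_id})
    assert gone.status_code == 404

Running this single test in a CI/CD pipeline gives fast feedback on the whole journey, which is exactly the kind of automated end-to-end check recommended earlier.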
The rapid ascent of artificial intelligence has ushered in an unprecedented era, often likened to a modern-day gold rush. This "AI gold rush," while brimming with potential, also bears a striking resemblance to the chaotic and lawless frontier of the American Wild West. We are witnessing an explosion of AI initiatives — from unmonitored chatbots running rampant to independent teams deploying large language models (LLMs) without oversight — all contributing to skyrocketing budgets and an increasingly unpredictable technological landscape. This unbridled enthusiasm, though undeniably promising for innovation, concurrently harbors significant and often underestimated dangers.

The current trajectory of AI development has indeed forged a new kind of "lawless land." Pervasive "shadow deployments" of AI systems, unsecured AI endpoints, and unchecked API calls are running wild, creating a critical lack of visibility into who is developing what, and how. Much like the historical gold rush, this is a full-throttle race to exploit a new resource, with alarmingly little consideration given to inherent risks, essential security protocols, or spiraling costs. The industry is already rife with cautionary tales: the rogue AI agent that inadvertently leaked highly sensitive corporate data, or the autonomous agent that, in a mere five minutes, initiated a thousand unauthorized API calls. These "oops moments" are not isolated incidents; they are becoming distressingly common occurrences in this new, unregulated frontier.

This is precisely where the critical role of the platform engineer emerges. In this burgeoning chaos, the platform engineer is uniquely positioned to bring much-needed order, stepping into the role of the new "sheriff." More accurately, given the complexities of AI, they are evolving into the governance marshal. This transformation isn't a mere rebranding; it reflects a profound evolution of the role itself. Historically, during the nascent stages of DevOps, platform engineers operated more as "cowboys" — driven by speed, experimentation, and a minimal set of rules. With the maturation of Kubernetes and the advent of widespread cloud adoption, they transitioned into "settlers," diligently building stable, reliable platforms that empowered developers. Now, in the dynamic age of AI, the platform engineer must embrace the mantle of the marshal — a decisive leader singularly focused on instilling governance, ensuring safety, and establishing comprehensive observability across this volatile new frontier.

The Evolution of the Platform Engineer: From Builder to Guardian

This shift in identity signifies far more than just a new job title; it represents a fundamental redefinition of core responsibilities. The essence of the platform engineer's role is no longer solely about deploying and managing infrastructure. It has expanded to encompass the crucial mandate of ensuring that this infrastructure remains safe, stable, and inherently trusted. This new form of leadership transcends traditional hierarchical structures; it is fundamentally about influence — the ability to define and enforce the critical standards upon which all other development will be built. While it may occasionally necessitate saying "no" to risky endeavors, more often, it involves saying "yes" with a clearly defined and robust set of guardrails, enabling innovation within secure parameters.
As a governance marshal, the platform engineer is tasked with three paramount responsibilities:

Gatekeeper of infrastructure: The platform engineer stands as the primary guardian at the very entry point of modern AI infrastructure. Their duty is to meticulously vet and ensure that everything entering the system is unequivocally safe, secure, and compliant with established policies and regulations. This involves rigorous checks and controls to prevent unauthorized or malicious elements from compromising the ecosystem.

Governance builder: Beyond merely enforcing rules, the platform engineer is responsible for actively designing and integrating governance mechanisms directly into the fabric of the platform itself. This means embedding policies, compliance frameworks, and security protocols as foundational components, rather than afterthoughts. By building governance into the core, they create a self-regulating environment that naturally steers development towards best practices.

Enabler of innovation: Crucially, the ultimate objective of the platform engineer is not to impede progress or stifle creativity. Instead, their mission is to empower teams to build and experiment fearlessly, without the constant dread of catastrophic failures. This role transforms into that of a strategic enabler, turning seemingly impossible technical feats into repeatable, manageable processes through the provision of standardized templates, robust self-service tools, and clearly defined operational pathways. They construct the scaffolding that allows innovation to flourish securely.

Consider the platform engineer not as an obstructionist, but rather as a highly skilled and visionary highway engineer. They are meticulously designing the safe on-ramps, erecting unambiguous signage, and setting appropriate speed limits that enable complex AI workflows to operate at peak efficiency and speed, all while preventing collisions and catastrophic system failures.

The Governance Arsenal: The AI Marshal Stack

Platform engineers do not enter this challenging new domain unprepared. They possess a sophisticated toolkit — their "governance arsenal" — collectively known as the AI Marshal Stack. This arsenal is composed of several critical components:

AI gateway: Functioning as a "fortified outpost," the AI gateway establishes a single, secure point of entry for all applications connecting to various LLMs and external AI vendors. This strategic choke point is where fundamental controls are implemented, including intelligent rate limiting to prevent overload, robust authentication to verify user identities, and critical PII (personally identifiable information) redaction to protect sensitive data before it reaches the AI models.

Access control: This element represents "the law" within the AI ecosystem. By leveraging granular role-based access control (RBAC), the platform engineer can precisely define and enforce who has permission to use specific AI tools, services, and data. This ensures that only authorized individuals and applications can interact with sensitive AI resources, minimizing unauthorized access and potential misuse.

Rate limiting: This is the essential "crowd control" mechanism.
It acts as a preventative measure against financial stampedes and operational overloads, effectively preventing scenarios like a misconfigured or rogue AI agent making thousands of costly API calls within a matter of minutes, thereby safeguarding budgets and system stability.

Observability: These components serve as the "eyes on the street," providing critical real-time insights into the AI landscape. A significant proportion of AI-related problems stem not from technical failures but from a profound lack of visibility. With comprehensive observability, the platform engineer gains precise knowledge of who is doing what, when, and how, enabling them to swiftly identify and address misbehaving agents or unexpected API spikes before they escalate into significant damage or costly incidents.

Cost controls: These are the "bankers" of the AI Marshal Stack. They are designed to prevent financial overruns by setting explicit limits on AI resource consumption and preventing the shock of unexpectedly large cloud bills. By implementing proactive cost monitoring and control mechanisms, they ensure that AI initiatives remain within budgetary constraints, fostering responsible resource allocation.

By meticulously constructing and deploying these interconnected systems, platform engineers are not merely averting chaos; they are actively fostering an environment where teams can build and innovate with unwavering confidence. The greater the trust users have in the underlying AI infrastructure and its governance, the more rapidly and boldly innovation can proceed. Governance, in essence, is the mechanism through which trust is scaled across an organization. Just as robust rules and well-defined structures allowed rudimentary frontier towns to evolve into flourishing, complex cities, comprehensive AI governance is the indispensable framework that will enable AI to transition from a series of disparate, one-off experiments into a cohesive, strategically integrated product strategy.

Why the Platform Engineer Is the Right Person for the Job: The AI Marshal's Unique Advantage

Platform engineers are uniquely and exceptionally well-suited to assume this critical role of the governance marshal. They possess the nuanced context of development cycles, the inherent influence within engineering organizations, and the technical toolkit necessary to implement and enforce AI governance effectively. They have lived through and shaped the eras of the "cowboy" and the "settler"; now, it is unequivocally their time to become the "marshal."

The AI landscape, while transformative, is not inherently lawless. However, it desperately requires systematic enforcement and a foundational structure. It needs a leader to build the stable scaffolding that allows developers to move with agility and speed without the constant threat of crashing and burning. This vital undertaking is not about imposing control for the sake of control; rather, it is fundamentally about safeguarding everyone from the inevitable "oops moments" that can derail projects, compromise data, and exhaust budgets. It is about actively constructing a superior, inherently safer, and demonstrably smarter AI future for every stakeholder.

Therefore, the call to action for platform engineers is clear and urgent: do not passively await others to define the rules of this new frontier. Seize the initiative. Embrace the role of the hero.
Build a thriving, resilient AI town where innovation can flourish unencumbered, and where everyone can contribute and grow without the paralyzing fear of stepping on a hidden landmine.

Final Thoughts

AI doesn't need to be feared. It just needs to be governed. And governance doesn't mean slowing down — it means creating the structures that let innovation thrive. Platform engineers are in the perfect position to lead this shift. We've been cowboys. We've been settlers. Now it's time to become marshals.

So, to all the platform engineers out there: pick up your badge, gather your toolkit, and help tame the AI frontier. The future of safe, scalable, and trusted AI depends on it. Because the Wild West was never meant to last forever. Towns become cities. And with the right governance in place, AI can move from chaos to confidence — and unlock its full potential.

Want to dive deeper into the AI Marshal Stack and see how platform engineers can tame the AI Wild West in practice? Watch my full PlatformCon 2025 session here: Discover how to move from cowboy experiments to marshal-led governance — and build the trusted AI foundations your organization needs.